Abstract

Tests based on sample mean vectors and sample spatial signs have been studied in the recent 
literature for high dimensional data with the dimension larger than the sample size. For suitable
sequences of alternatives, we show that the powers of the mean based tests and the tests
based on spatial signs and ranks tend to be the same as the data dimension grows to infinity for
any sample size, when the coordinate variables satisfy appropriate mixing conditions. Further, 
their limiting powers do not depend on the heaviness of the tails of the distributions. This is 
in striking contrast to the asymptotic results obtained in the classical multivariate setup. On 
the other hand, we show that in the presence of stronger dependence among the coordinate 
variables, the spatial sign and rank based tests for high dimensional data can be asymptotically 
more powerful than the mean based tests if in addition to the data dimension, the sample size 
also grows to infinity. The sizes of some mean based tests for high dimensional data studied in 
the recent literature are observed to be significantly different from their nominal levels. This 
is due to the inadequacy of the asymptotic approximations used for the distributions of those 
test statistics. However, our asymptotic approximations for the tests based on spatial signs and 
ranks are observed to work well when the tests are applied on a variety of simulated and real 
datasets. 

Keywords: ARMA processes, heavy tailed distributions, permutation tests, $\rho$-mixing, randomly
scaled $\rho$-mixing, spherical distributions, stationary sequences


1 Introduction 

For univariate data, nonparametric tests based on signs and ranks are well-known competitors of tests based on sample means like the t-test. These nonparametric tests have the distribution-free property, and they are asymptotically more efficient than the mean based tests for non-Gaussian distributions having heavy tails. Although various extensions of these nonparametric tests have been proposed for multivariate data (see Puri and Sen (1971), Oja (2010) and Hettmansperger and McKean (2011)), they do not have the distribution-free property in general, and they are often implemented using their permutation distributions. However, like their univariate counterparts, they are usually asymptotically more efficient than the mean based Hotelling's $T^2$ test for multivariate non-Gaussian distributions with heavy tails (see Choi and Marden (1997), Möttönen et al. (1997), Marden (1999) and Oja (2010)).

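For high dimensional data, where the data dimension is larger than the sample size, Hotelling's $T^2$ test is not applicable due to the singularity of the sample dispersion matrix. Let $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_m$ and $\mathbf{Y}_1, \mathbf{Y}_2, \ldots, \mathbf{Y}_n$ be i.i.d. copies of independent random vectors $\mathbf{X}$ and $\mathbf{Y}$ in $\mathbb{R}^d$. For testing $H_0 : E(\mathbf{X}) = E(\mathbf{Y})$ against the alternative $H_A : E(\mathbf{X}) \neq E(\mathbf{Y})$ for two high dimensional observations $\mathbf{X}$ and $\mathbf{Y}$, Bai and Saranadasa (1996) proposed a test based on $\|\bar{\mathbf{X}} - \bar{\mathbf{Y}}\|^2$, where $\bar{\mathbf{X}}$ and $\bar{\mathbf{Y}}$ are the sample means of the two samples. Chen and Qin (2010) proposed a test statistic after removing the terms $\sum_i \mathbf{X}_i'\mathbf{X}_i$ and $\sum_j \mathbf{Y}_j'\mathbf{Y}_j$ appearing in the expansion of $\|\bar{\mathbf{X}} - \bar{\mathbf{Y}}\|^2$, which makes the resulting statistic an unbiased estimator of $\|E(\mathbf{X} - \mathbf{Y})\|^2$. The one sample and the two sample statistics of Chen and Qin (2010) based on sample means are

\[ T_{CQ}^{(1)} = \frac{1}{(n)_2} \sum_{i_1 \neq i_2} \mathbf{X}_{i_1}'\mathbf{X}_{i_2} \quad \text{and} \quad T_{CQ}^{(2)} = \frac{1}{(m)_2 (n)_2} \sum_{i_1 \neq i_2} \sum_{j_1 \neq j_2} (\mathbf{X}_{i_1} - \mathbf{Y}_{j_1})'(\mathbf{X}_{i_2} - \mathbf{Y}_{j_2}), \]

respectively, where $(p)_q = p(p - 1) \cdots (p - q + 1)$ for integers $p \geq 1$ and $1 \leq q \leq p$.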
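The two mean based statistics can be sketched in a few lines of code. The sketch below assumes that the one sample statistic averages $\mathbf{X}_{i_1}'\mathbf{X}_{i_2}$ over ordered pairs of distinct indices, and that the two sample statistic averages $(\mathbf{X}_{i_1} - \mathbf{Y}_{j_1})'(\mathbf{X}_{i_2} - \mathbf{Y}_{j_2})$ over $i_1 \neq i_2$ and $j_1 \neq j_2$. The function names are illustrative only, and the brute-force loops are meant for small samples rather than as an efficient implementation.

```python
from itertools import permutations


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def t_cq_one_sample(xs):
    """Average of X_i1' X_i2 over ordered pairs of distinct indices."""
    n = len(xs)
    total = sum(dot(xs[i1], xs[i2]) for i1, i2 in permutations(range(n), 2))
    return total / (n * (n - 1))


def t_cq_two_sample(xs, ys):
    """Average of (X_i1 - Y_j1)'(X_i2 - Y_j2) over i1 != i2 and j1 != j2."""
    m, n = len(xs), len(ys)
    total = 0.0
    for i1, i2 in permutations(range(m), 2):
        for j1, j2 in permutations(range(n), 2):
            u = [a - b for a, b in zip(xs[i1], ys[j1])]
            v = [a - b for a, b in zip(xs[i2], ys[j2])]
            total += dot(u, v)
    return total / (m * (m - 1) * n * (n - 1))
```

Expanding the product shows that the two sample value equals the sum of the two one sample values minus $2/(mn)$ times the sum of all cross products $\mathbf{X}_i'\mathbf{Y}_j$, which gives a quick numerical check of the code.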

Well known multivariate spatial sign and rank based tests (see Möttönen and Oja (1995), Möttönen et al. (1997), Choi and Marden (1997), Marden (1999) and Oja (2010)) also involve inverses of dispersion matrices computed from the sample, which become singular when the data dimension exceeds the sample size. Wang et al. (2015) proposed a one sample test of the mean vector based on spatial signs given by

\[ T_S = \frac{1}{(n)_2} \sum_{i_1 \neq i_2} S(\mathbf{X}_{i_1})' S(\mathbf{X}_{i_2}), \]

where $S(\mathbf{x}) = \mathbf{x}/\|\mathbf{x}\|$ denotes the spatial sign of any $\mathbf{x} \in \mathbb{R}^d$. A natural high dimensional version of the one sample spatial signed rank statistic can be defined using the idea of Wang et al. (2015), and it is given by

\[ T_{SR} = \frac{1}{(n)_4} \sum_{i_1, i_2, i_3, i_4 \text{ all distinct}} S(\mathbf{X}_{i_1} + \mathbf{X}_{i_2})' S(\mathbf{X}_{i_3} + \mathbf{X}_{i_4}). \]

Similarly, a two sample spatial rank statistic can be defined as

\[ T_{WMW} = \frac{1}{(m)_2 (n)_2} \sum_{i_1 \neq i_2} \sum_{j_1 \neq j_2} S(\mathbf{Y}_{j_1} - \mathbf{X}_{i_1})' S(\mathbf{Y}_{j_2} - \mathbf{X}_{i_2}). \]

Note that $T_S$, $T_{SR}$ and $T_{WMW}$ are unbiased estimators of $\|E\{S(\mathbf{X}_1)\}\|^2$, $\|E\{S(\mathbf{X}_1 + \mathbf{X}_2)\}\|^2$ and $\|E\{S(\mathbf{X} - \mathbf{Y})\}\|^2$, respectively.
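A minimal sketch of the spatial sign based statistics follows. It assumes that $T_S$ averages $S(\mathbf{X}_{i_1})'S(\mathbf{X}_{i_2})$ over ordered pairs of distinct indices and that $T_{WMW}$ averages $S(\mathbf{Y}_{j_1} - \mathbf{X}_{i_1})'S(\mathbf{Y}_{j_2} - \mathbf{X}_{i_2})$ over $i_1 \neq i_2$ and $j_1 \neq j_2$, which is what the unbiasedness statement above requires. The function names and the convention of mapping the zero vector to itself are assumptions made only for this illustration.

```python
import math
from itertools import permutations


def spatial_sign(x):
    """Return x / ||x||; the zero vector is mapped to itself by convention."""
    norm = math.sqrt(sum(a * a for a in x))
    return [a / norm for a in x] if norm > 0.0 else [0.0] * len(x)


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def t_s(xs):
    """Average of S(X_i1)' S(X_i2) over ordered pairs of distinct indices."""
    n = len(xs)
    signs = [spatial_sign(x) for x in xs]
    total = sum(dot(signs[i1], signs[i2]) for i1, i2 in permutations(range(n), 2))
    return total / (n * (n - 1))


def t_wmw(xs, ys):
    """Average of S(Y_j1 - X_i1)' S(Y_j2 - X_i2) over i1 != i2 and j1 != j2."""
    m, n = len(xs), len(ys)
    # Precompute the spatial sign of every pairwise difference Y_j - X_i.
    signs = [[spatial_sign([b - a for a, b in zip(x, y)]) for y in ys] for x in xs]
    total = 0.0
    for i1, i2 in permutations(range(m), 2):
        for j1, j2 in permutations(range(n), 2):
            total += dot(signs[i1][j1], signs[i2][j2])
    return total / (m * (m - 1) * n * (n - 1))
```

Because each observation enters only through its direction, rescaling any single observation by a positive constant leaves `t_s` unchanged.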

In this article, we study the behaviours of different tests based on sample means, spatial signs 
and ranks under various probability models for high dimensional data. In Section 2, we prove 
that under appropriate mixing conditions on the coordinate variables and suitable sequences of 
alternatives, the limiting powers of the spatial rank based test and the mean based tests are the 
same as the data dimension grows to infinity. This is true for all sample sizes and irrespective 
of the heaviness of the tails of the underlying distributions. Analogous results hold for the one 
sample spatial sign and signed rank based tests and the mean based tests, and those are presented
in subsection 2.2. These results are in striking contrast to the asymptotic results obtained in the
traditional multivariate setup, where the data dimension is fixed and the sample sizes grow to
infinity. In such a setup, the multivariate spatial sign and rank based tests are asymptotically less
efficient than Hotelling's $T^2$ test for Gaussian distributions, and they are more efficient than the
$T^2$ test for non-Gaussian distributions with heavy tails (see Möttönen et al. (1997), Choi and Marden
(1997), Marden (1999) and Oja (2010)). Recall that for multivariate Gaussian data, Hotelling's $T^2$
test is actually the likelihood ratio test and the most powerful invariant test. In Section 3, we
prove that in the presence of some stronger dependence among the coordinate variables, the limiting
powers of the spatial sign and rank based tests can be more than those of their competitors based
on sample means if we first let the data dimension and then the sample size grow to infinity.
In Section 4, we demonstrate the performances of the tests based on sample means and spatial 
signs and ranks using some real datasets. In Section 5, we discuss the performances of these tests 
in comparison with some other mean based tests for high dimensional data available in recent 
literature. It is found that the sizes of some of the mean based tests are significantly different 
from their nominal sizes due to the inadequacy of the asymptotic approximations used for the 
distributions of the corresponding test statistics. The proofs of all the theorems are presented in 
Appendix - I. 

2 Asymptotic behaviours of different tests under $\rho$-mixing

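Let $\mathcal{X} = (X_1, X_2, \ldots)$ be an infinite sequence of random variables defined over a probability space $(\Omega, \mathcal{A}, P)$.

Definition 2.1 (Kolmogorov and Rozanov (1960)). A sequence $\mathcal{X}$ is said to be $\rho$-mixing if $\rho(d) = \sup_{k \geq 1} \sup_{f \in \mathcal{F}_1^k,\, g \in \mathcal{F}_{k+d}^{\infty}} |\mathrm{Corr}(f, g)|$ converges to zero as $d \to \infty$. Here, $\rho(\cdot)$ is called the $\rho$-mixing coefficient of $\mathcal{X}$, and $\mathcal{F}_1^k$ denotes the $\sigma$-field generated by measurable square integrable functions of $(X_1, X_2, \ldots, X_k)$ for $k \geq 1$, with $\mathcal{F}_{k+d}^{\infty}$ defined analogously from $(X_{k+d}, X_{k+d+1}, \ldots)$.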

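We refer to Lin and Lu (1996) and Bradley (2005) for further details about $\rho$-mixing sequences. Let $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_m$ and $\mathbf{Y}_1, \mathbf{Y}_2, \ldots, \mathbf{Y}_n$ be i.i.d. copies of independent random vectors $\mathbf{X}$ and $\mathbf{Y}$ in $\mathbb{R}^d$. We assume the following conditions.

(C1) $\mathbf{X} = \mu_1 + \mathbf{V}$ and $\mathbf{Y} = \mu_2 + \mathbf{W}$ for some $\mu_1, \mu_2 \in \mathbb{R}^d$, where $\mathbf{V}$ and $\mathbf{W}$ are vectors formed by the first $d$ coordinates of the zero mean, strictly stationary, and $\rho$-mixing sequences $\mathcal{V} = (V_1, V_2, \ldots)$ and $\mathcal{W} = (W_1, W_2, \ldots)$ satisfying $E(V_1^4) < \infty$ and $E(W_1^4) < \infty$.

(C2) The $\rho$-mixing coefficients $\rho_1(\cdot)$ and $\rho_2(\cdot)$ of $\mathcal{V}$ and $\mathcal{W}$ satisfy $\sum_{k=1}^{\infty} \rho_1(2^k) < \infty$ and $\sum_{k=1}^{\infty} \rho_2(2^k) < \infty$, respectively.

Denote $\mu = \mu_2 - \mu_1$, $\sigma_1^2 = \mathrm{Var}(X_1) > 0$, $\sigma_2^2 = \mathrm{Var}(Y_1) > 0$, $\Sigma_1 = \mathrm{Disp}(\mathbf{X})$, and $\Sigma_2 = \mathrm{Disp}(\mathbf{Y})$, where $\mathbf{X} = (X_1, X_2, \ldots, X_d)$ and $\mathbf{Y} = (Y_1, Y_2, \ldots, Y_d)$.

(C3) $\ldots \to 0$ for some $\epsilon > 0$ and $\mu'(\Sigma_1 + \Sigma_2)\mu = o(\mathrm{tr}(\Sigma_1^2 + \Sigma_2^2))$ as $d \to \infty$.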

Examples of $\rho$-mixing sequences include $m$-dependent sequences, stationary ARMA($p$, $q$) processes
with white noise innovation process (see Lin and Lu (1996, Theorem 1.1.2)), and hidden Markov
models whose underlying generator sequences are stationary, Gaussian and geometrically ergodic
Markov chains (see Bradley (2005, Theorem 3.7)). For all of the above models, condition (C2)
holds. Condition (C3) is trivially true under the null hypothesis $H_0 : \mu = 0$. Note that when $\Sigma_1$
and $\Sigma_2$ are identity matrices, the second part of condition (C3) is automatically true if its first part
holds. In general, the second part of condition (C3) holds if in addition to the first part, we have

>^‘ELiAl = o(d>/2+.) 

as $d \to \infty$, where $\lambda_1 \leq \lambda_2 \leq \ldots \leq \lambda_d$ are the eigenvalues of $\Sigma_1 + \Sigma_2$.
Chen and Qin (2010) worked in a setup, where $\mathbf{X}$ and $\mathbf{Y}$ are affine transformations of certain zero mean random vectors, whose coordinates are "pseudo-independent" (see (3.2) in p. 811 in that paper). The distributional assumptions in (C1) and (C2) cover many distributions that satisfy the model assumptions stated in (3.1) in Chen and Qin (2010, p. 811), e.g., distributions with independent coordinates, moving average processes and more generally $m$-dependent sequences as well as autoregressive processes. Fan and Lin (1998) considered the problem of testing equality of two mean curves for functional data, and they modelled the data as a finite dimensional one, where the data dimension is larger than the sample size. A class of probability models considered by them are stationary linear Gaussian processes, many of which satisfy the model assumptions considered above. Srivastava et al. (2013) proposed a two sample mean based test using the sum of squares of the coordinatewise $t$ statistics and studied its properties assuming multivariate Gaussianity of the data, which includes many distributions satisfying Assumptions (C1) and (C2). A closely related test was proposed by Gregory et al. (2014), and they studied its properties under $\alpha$-mixing (see Lin and Lu (1996)) conditions on the data, which are weaker than the $\rho$-mixing setup considered above. However, those authors required the existence of sixteenth order moments. Cai et al. (2014) proposed a mean based test for detecting sparse alternatives and studied its properties primarily under the assumption of multivariate Gaussianity of the data. Feng et al. (2015) proposed a modification of the test in Srivastava et al. (2013) and they worked in a setup similar to that considered by Chen and Qin (2010). Thus, as in the case of the latter paper, many probability distributions included in the setup considered by Feng et al. (2015) satisfy the $\rho$-mixing assumptions described here. Wei et al. (2015) studied the properties of their test under spherical Gaussian distributions, which are special cases of the $\rho$-mixing models considered here.

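Theorem 2.1. Suppose that conditions (C1)-(C3) are satisfied. Define $\Gamma_1 = 2\mathrm{tr}(\Sigma_1^2)/(m)_2 + 2\mathrm{tr}(\Sigma_2^2)/(n)_2 + 4\mathrm{tr}(\Sigma_1\Sigma_2)/(mn)$. Then, each of $[d(\sigma_1^2 + \sigma_2^2)T_{WMW} - \|\mu\|^2]/\Gamma_1^{1/2}$ and $(T_{CQ}^{(2)} - \|\mu\|^2)/\Gamma_1^{1/2}$ converges weakly to a standard Gaussian variable as $d \to \infty$ for every fixed $m, n > 1$.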

When the null hypothesis $H_0 : \mu = 0$ is true, the above theorem yields the asymptotic null distributions of $T_{WMW}$ and $T_{CQ}^{(2)}$ as $d \to \infty$. Let us observe that the asymptotic distribution of $T_{CQ}^{(2)}$ obtained in the above theorem as $d \to \infty$ is the same as that obtained by Chen and Qin (2010) in their Theorem 1 when both $d, n \to \infty$. These authors used an assumption similar to that in the second part of condition (C3) for deriving the asymptotic distribution of their test statistic, when both $d$ and $n$ are large (see (3.4) in p. 812 in Chen and Qin (2010)).

When the alternative hypothesis $H_A : \mu \neq 0$ is true, the next theorem compares the asymptotic powers of the tests based on $T_{WMW}$ and $T_{CQ}^{(2)}$ for high dimensional data. Let $\beta_{T_{WMW}}(\mu)$ and $\beta_{T_{CQ}^{(2)}}(\mu)$ be the powers of these two tests at a given level of significance.

Theorem 2.2. Suppose that conditions (C1)-(C3) are satisfied, and assume $\lim_{d \to \infty} \|\mu\|^2/\Gamma_1^{1/2} = c$ for some $c \in [0, \infty]$. Then, $\lim_{d \to \infty} \beta_{T_{WMW}}(\mu) = \lim_{d \to \infty} \beta_{T_{CQ}^{(2)}}(\mu) = \beta$ for every fixed $m, n > 1$, where $\beta = \alpha$, $\beta = 1$, or $\beta \in (\alpha, 1)$ according as $c = 0$, $c = \infty$, or $c \in (0, \infty)$, respectively. Here, $\alpha$ is the level of significance of the test.

The above theorem implies that the asymptotic powers of the mean based and the spatial rank based tests are the same as $d \to \infty$ for each fixed $m, n > 1$. If $\Sigma_1$ and $\Sigma_2$ equal the $d \times d$ identity matrix, and $d$ is large, the limiting powers of the tests based on $T_{WMW}$ and $T_{CQ}^{(2)}$ differ according as $\|\mu\|/d^{1/4}$ converges to zero, infinity or some $c \in (0, \infty)$.
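The three regimes for $c$ can be illustrated numerically. The sketch below assumes a one-sided test that rejects when the standardized statistic exceeds the upper $\alpha$ quantile $z_\alpha$ of the standard Gaussian distribution, and it takes the Gaussian limit of Theorem 2.1 at face value, so that the limiting power is $\Phi(c - z_\alpha)$. Both the one-sided rejection rule and the function name are assumptions made for illustration; they are not stated in this form above.

```python
from statistics import NormalDist


def limiting_power(c, alpha=0.05):
    """Phi(c - z_alpha): limiting power of a one-sided Gaussian test with shift c."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha)  # upper alpha quantile
    return std_normal.cdf(c - z_alpha)
```

Under these assumptions the value equals $\alpha$ at $c = 0$, increases with $c$, and approaches one as $c$ grows, in line with the three cases of Theorem 2.2.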

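2.1 Empirical study using some $\rho$-mixing models

For implementing the tests based on $T_{WMW}$ and $T_{CQ}^{(2)}$ under the $\rho$-mixing setup, we can use their limiting null distributions obtained from Theorem 2.1 after plugging-in the following unbiased estimators of the parameters involved:

\[ \widehat{\Gamma}_1 = \frac{2}{(m)_2}\widehat{\mathrm{tr}}(\Sigma_1^2) + \frac{2}{(n)_2}\widehat{\mathrm{tr}}(\Sigma_2^2) + \frac{4}{mn}\widehat{\mathrm{tr}}(\Sigma_1\Sigma_2), \]

where

\[ \widehat{\mathrm{tr}}(\Sigma_1^2) = \frac{1}{4(m)_4} \sum_{i_1, i_2, i_3, i_4 \text{ all distinct}} [(\mathbf{X}_{i_1} - \mathbf{X}_{i_2})'(\mathbf{X}_{i_3} - \mathbf{X}_{i_4})]^2, \]

\[ \widehat{\mathrm{tr}}(\Sigma_2^2) = \frac{1}{4(n)_4} \sum_{j_1, j_2, j_3, j_4 \text{ all distinct}} [(\mathbf{Y}_{j_1} - \mathbf{Y}_{j_2})'(\mathbf{Y}_{j_3} - \mathbf{Y}_{j_4})]^2, \quad \text{and} \]

\[ \widehat{\mathrm{tr}}(\Sigma_1\Sigma_2) = \frac{1}{4(m)_2(n)_2} \sum_{i_1 \neq i_2} \sum_{j_1 \neq j_2} [(\mathbf{X}_{i_1} - \mathbf{X}_{i_2})'(\mathbf{Y}_{j_1} - \mathbf{Y}_{j_2})]^2. \]

Also, $\widehat{\sigma}_1^2 = [d(m - 1)]^{-1} \sum_{k=1}^{d} \sum_{i=1}^{m} (X_{ik} - \bar{X}_k)^2$, where $\bar{X}_k = m^{-1} \sum_{i=1}^{m} X_{ik}$ with $\mathbf{X}_i = (X_{i1}, X_{i2}, \ldots, X_{id})$, $1 \leq i \leq m$, and $\widehat{\sigma}_2^2 = [d(n - 1)]^{-1} \sum_{k=1}^{d} \sum_{j=1}^{n} (Y_{jk} - \bar{Y}_k)^2$, where $\bar{Y}_k = n^{-1} \sum_{j=1}^{n} Y_{jk}$ with $\mathbf{Y}_j = (Y_{j1}, Y_{j2}, \ldots, Y_{jd})$, $1 \leq j \leq n$. Note that $\widehat{\Gamma}_1$ is invariant under location transformations unlike the estimator proposed by Chen and Qin (2010, p. 815). Moreover, for all simulated datasets and real datasets considered later, the empirical sizes and powers of the test based on $T_{CQ}^{(2)}$ implemented as above are similar to those of the original two sample test in Chen and Qin (2010).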
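The plug-in estimators can be sketched as follows. The sketch assumes that the trace estimators are averages of squared inner products of differences of distinct observations, scaled by $1/4$ so that they are unbiased, and that the coordinate variance is a pooled sample variance with divisor $d(m - 1)$. All function names are hypothetical, and the four-index loop costs on the order of $m^4 d$ operations, so it is meant for small samples only.

```python
from itertools import permutations


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def diff(u, v):
    """Coordinatewise difference u - v."""
    return [a - b for a, b in zip(u, v)]


def trace_sq_hat(xs):
    """Estimate tr(Sigma^2) from one sample using four distinct observations."""
    m = len(xs)
    total = 0.0
    for i1, i2, i3, i4 in permutations(range(m), 4):
        total += dot(diff(xs[i1], xs[i2]), diff(xs[i3], xs[i4])) ** 2
    return total / (4.0 * m * (m - 1) * (m - 2) * (m - 3))


def trace_cross_hat(xs, ys):
    """Estimate tr(Sigma_1 Sigma_2) from two independent samples."""
    m, n = len(xs), len(ys)
    total = 0.0
    for i1, i2 in permutations(range(m), 2):
        for j1, j2 in permutations(range(n), 2):
            total += dot(diff(xs[i1], xs[i2]), diff(ys[j1], ys[j2])) ** 2
    return total / (4.0 * m * (m - 1) * n * (n - 1))


def gamma1_hat(xs, ys):
    """Plug-in estimate of Gamma_1 built from the three trace estimates."""
    m, n = len(xs), len(ys)
    return (2.0 * trace_sq_hat(xs) / (m * (m - 1))
            + 2.0 * trace_sq_hat(ys) / (n * (n - 1))
            + 4.0 * trace_cross_hat(xs, ys) / (m * n))


def coordinate_variance_hat(xs):
    """Pooled sample variance over the d coordinates, with divisor d(m - 1)."""
    m, d = len(xs), len(xs[0])
    means = [sum(x[k] for x in xs) / m for k in range(d)]
    total = sum((x[k] - means[k]) ** 2 for x in xs for k in range(d))
    return total / (d * (m - 1))
```

Since every term depends on the data only through differences within a sample, adding a constant vector to all observations of either sample leaves `gamma1_hat` unchanged, which mirrors the location invariance noted above.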

To compare the performances of the tests based on $T_{WMW}$ and $T_{CQ}^{(2)}$, we have considered the AR(1) models with correlation 0.7 having Gaussian and $t(5)$ innovations. The sample sizes are $m = n = 20$, and $\mu = (c, 0, 0, \ldots, 0)$ with $c = 1.5, 3, 4.5, 6, 7.5$ for $d = 100, 200, 400, 800, 1600$, respectively. The sizes and the powers of the tests based on $T_{WMW}$ and $T_{CQ}^{(2)}$ are averaged over 1000 Monte Carlo simulations. We found that the sizes of the tests are not significantly different from the nominal 5% level for both the models. It is seen from Figure 1 that the powers of these two tests are similar for all data dimensions considered under both the models. The power curves are so close that they are overlaid on each other.
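Data of the kind used in this simulation can be generated along the following lines. The sketch interprets "correlation 0.7" as an AR(1) coefficient of 0.7, approximates stationarity by discarding a burn-in stretch, and draws a $t(5)$ innovation as a standard Gaussian variable divided by the square root of an independent chi-square variable over its degrees of freedom. These choices, the burn-in length, and all function names are assumptions for illustration and not a description of the original simulation code.

```python
import math
import random


def gaussian_innovation(rng):
    """Standard Gaussian innovation."""
    return rng.gauss(0.0, 1.0)


def t_innovation(rng, df=5):
    """Student t draw: standard Gaussian over sqrt(chi-square / df)."""
    chi_square = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    return rng.gauss(0.0, 1.0) / math.sqrt(chi_square / df)


def ar1_vector(d, phi, innovation, rng, burn_in=200):
    """First d coordinates of V_k = phi * V_{k-1} + e_k after a burn-in."""
    value = 0.0
    for _ in range(burn_in):
        value = phi * value + innovation(rng)
    coords = []
    for _ in range(d):
        value = phi * value + innovation(rng)
        coords.append(value)
    return coords


def ar1_sample(size, d, phi, innovation, rng, shift=0.0):
    """Draw `size` vectors and add `shift` to the first coordinate only."""
    sample = []
    for _ in range(size):
        v = ar1_vector(d, phi, innovation, rng)
        v[0] += shift
        sample.append(v)
    return sample
```

For instance, `ar1_sample(20, 100, 0.7, gaussian_innovation, random.Random(1))` and `ar1_sample(20, 100, 0.7, gaussian_innovation, random.Random(2), shift=1.5)` would give two samples resembling the $d = 100$ setting with Gaussian innovations.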



Figure 1: Powers of the tests at nominal 5% level based on $T_{WMW}$ (- + - curves) and $T_{CQ}^{(2)}$ (- o - curves), plotted against $\log_2(d/50)$, for the AR(1) model with Gaussian innovation (left panel) and $t(5)$ innovation (right panel). The two power curves are overlaid on each other in both the plots.

2.2 Asymptotic behaviours of one sample tests under $\rho$-mixing

Let $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n$ be i.i.d. copies of a random vector $\mathbf{X} \in \mathbb{R}^d$. The following theorem gives the asymptotic distributions of $T_S$, $T_{SR}$ and $T_{CQ}^{(1)}$ and compares their asymptotic powers, when the data dimension is large. Denote $\beta_{T_S}(\mu)$, $\beta_{T_{SR}}(\mu)$ and $\beta_{T_{CQ}^{(1)}}(\mu)$ to be the powers of the tests based on $T_S$, $T_{SR}$ and $T_{CQ}^{(1)}$ at a given level of significance, when the alternative hypothesis $H_A : \mu \neq 0$ is true. Let us assume the following condition, which is the one sample version of condition (C3).
(C4) $\ldots \to 0$ for some $\epsilon > 0$ and $\mu'\Sigma\mu = o(\mathrm{tr}(\Sigma^2))$ as $d \to \infty$, where $\Sigma = \mathrm{Disp}(\mathbf{X})$.

Theorem 2.3. Let $\mathbf{X} = \mu + \mathbf{V}$, where $\mathbf{V}$ is the vector formed by the first $d$ coordinates of the infinite sequence $\mathcal{V}$ satisfying conditions (C1) and (C2), and $\mu$ satisfies condition (C4). Define $\Gamma_2 = 2\mathrm{tr}(\Sigma^2)/(n)_2$, and $\sigma^2 = \mathrm{Var}(X_1)$, where $\mathbf{X} = (X_1, X_2, \ldots, X_d)$.

(a) Each of $(d\sigma^2 T_S - \|\mu\|^2)/\Gamma_2^{1/2}$, $(d\sigma^2 T_{SR} - 2\|\mu\|^2)/(2\Gamma_2^{1/2})$ and $(T_{CQ}^{(1)} - \|\mu\|^2)/\Gamma_2^{1/2}$ converges weakly to a standard Gaussian variable as $d \to \infty$ for every fixed $n > 1$.

(b) Assume $\lim_{d \to \infty} \|\mu\|^2/\Gamma_2^{1/2} = c$ for some $c \in [0, \infty]$. Then, $\lim_{d \to \infty} \beta_{T_S}(\mu) = \lim_{d \to \infty} \beta_{T_{SR}}(\mu) = \lim_{d \to \infty} \beta_{T_{CQ}^{(1)}}(\mu) = \beta$ for every fixed $n > 1$, where $\beta = \alpha$, $\beta = 1$ or $\beta \in (\alpha, 1)$ according as $c = 0$, $c = \infty$, or $c \in (0, \infty)$, respectively.

We get the limiting null distributions of $T_S$, $T_{SR}$ and $T_{CQ}^{(1)}$ when $\mu = 0$ in the above theorem. When both the data dimension and the sample size grow to infinity, Wang et al. (2015) proved that the test based on $T_S$ is asymptotically as powerful as the test based on $T_{CQ}^{(1)}$ for spherical Gaussian distributions, which are included in our $\rho$-mixing model. The equality of the asymptotic powers of the tests based on $T_S$ and $T_{CQ}^{(1)}$ stated in part (b) of our Theorem 2.3 holds for any sample size and for many non-spherical distributions.

Remark 2.1. In both the one and the two sample problems, when our $\rho$-mixing model for the data holds, the limiting powers of the tests based on sample means and the tests based on spatial signs and ranks are equal when the data dimension is large. This is true for any sample size and irrespective of whether the coordinate variables have Gaussian or some other heavy tailed distributions.

3 Asymptotic behaviours of different tests under stronger dependence

We now consider another class of probability models for high dimensional data, where there is 
stronger dependence among the coordinate variables than what we have considered in the previous 
section. 

Definition 3.1. Consider an infinite sequence $\mathcal{X}$ defined over a probability space $(\Omega, \mathcal{A}, P)$. We say that $\mathcal{X}$ is a randomly scaled $\rho$-mixing sequence (RSRM sequence, say) if there exist a zero mean $\rho$-mixing sequence $\mathcal{R}$ and a positive non-degenerate random variable $U$ defined on $(\Omega, \mathcal{A}, P)$ such that $\mathcal{X} = \mathcal{R}/U$.

The RSRM property is satisfied by many important probability models for high dimensional data. For instance, the infinite sequence of random variables associated with the multivariate spherical $t$ distribution has this property. In fact, by Theorem 1.31 in Kallenberg (2005), it follows that any rotatable sequence $\mathcal{X}$, i.e., a sequence for which all finite dimensional marginals are spherically symmetric, can be viewed as an RSRM sequence. Here, $\mathcal{R}$ can be taken as a sequence of i.i.d. standard Gaussian variables and $U$ as a non-negative random variable independent of $\mathcal{R}$. More generally, if every finite dimensional marginal of a sequence $\mathcal{X}$ is elliptically symmetric, then $\mathcal{X} = \mathcal{R}/U$ with probability one, where $\mathcal{R}$ is a sequence of zero mean Gaussian variables, and $U$ is a non-negative random variable independent of $\mathcal{R}$. In this case, $\mathcal{X}$ has the RSRM property if the Gaussian sequence $\mathcal{R}$ is a $\rho$-mixing sequence. Let us mention here that Wang et al. (2015) primarily worked under the setup of elliptically symmetric models, and from the above discussion it follows that this class includes many distributions that have the RSRM property. Cai et al. (2014) also considered different classes of non-Gaussian models, and many of them have the RSRM property.
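The representation of a sequence as a base sequence divided by a common random scale can be sketched for the spherical $t$ distribution. The sketch below uses the standard construction of a multivariate $t$ vector as an i.i.d. standard Gaussian vector divided by $U = (\chi^2_{\nu}/\nu)^{1/2}$, with the chi-square variable drawn independently of the Gaussian vector. The function names are illustrative assumptions.

```python
import math
import random


def scale_by(base, u):
    """Divide every coordinate of the base vector by the common scale u."""
    return [r / u for r in base]


def spherical_t_vector(d, df, rng):
    """First d coordinates of R / U with R i.i.d. N(0, 1) and U = sqrt(chi2_df / df)."""
    # A chi-square variable with df degrees of freedom is Gamma(df / 2, scale 2).
    u = math.sqrt(rng.gammavariate(df / 2.0, 2.0) / df)
    base = [rng.gauss(0.0, 1.0) for _ in range(d)]
    return scale_by(base, u)
```

The single draw of `u` is shared by all $d$ coordinates, which is what makes the coordinates dependent even though the Gaussian base coordinates are independent.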

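For deriving the asymptotic distributions of $T_{WMW}$ and $T_{CQ}^{(2)}$ under the RSRM model, we assume the following.

(C5) $\mathbf{X} = \mu_1 + \mathbf{V}$ and $\mathbf{Y} = \mu_2 + \mathbf{W}$ for some $\mu_1, \mu_2 \in \mathbb{R}^d$, where $\mathbf{V}$ and $\mathbf{W}$ are vectors formed by the first $d$ coordinates of RSRM sequences $\mathcal{V}$ and $\mathcal{W}$. Let $\mathcal{V} = \widetilde{\mathcal{V}}/P$ and $\mathcal{W} = \widetilde{\mathcal{W}}/Q$, where $\widetilde{\mathcal{V}}$ and $\widetilde{\mathcal{W}}$ are independent $\rho$-mixing sequences satisfying (C1) and (C2), and $P$ and $Q$ are independent positive random variables.

As earlier, let $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_m$ and $\mathbf{Y}_1, \mathbf{Y}_2, \ldots, \mathbf{Y}_n$ be i.i.d. copies of independent random vectors $\mathbf{X}$ and $\mathbf{Y}$ in $\mathbb{R}^d$. Then, we can write $\mathbf{X}_i = \mu_1 + \widetilde{\mathbf{V}}_i/P_i$, $1 \leq i \leq m$, and $\mathbf{Y}_j = \mu_2 + \widetilde{\mathbf{W}}_j/Q_j$, $1 \leq j \leq n$.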

Theorem 3.1. Assume that (C5) holds, and $\mu = \mu_2 - \mu_1$ satisfies condition (C3) with $\Sigma_1$ and $\Sigma_2$ in that condition replaced by $\mathrm{Disp}(\widetilde{\mathbf{V}})$ and $\mathrm{Disp}(\widetilde{\mathbf{W}})$, respectively.

(a) There exist random variables $S_1$, $S_2$ and $S_3$ that are functions of the $P_i$'s and the $Q_j$'s such that each of $(dT_{WMW} - \|\mu\|^2 S_1)/S_2^{1/2}$ and $(T_{CQ}^{(2)} - \|\mu\|^2)/S_3^{1/2}$ converges weakly to a standard Gaussian variable as $d \to \infty$ for every $m, n > 1$. Consequently, for every fixed $m, n > 1$, the distributions of $T_{WMW}$ and $T_{CQ}^{(2)}$ can be approximated by location and scale mixtures of Gaussian distributions, when the data dimension is large.

(b) Assume further that all of $E(P)$, $E(Q)$, $E(P^{-2})$ and $E(Q^{-2})$ are finite, and $\|\mu\|^2/d^{1/2}$ tends to a finite non-negative limit as $d \to \infty$. Then, there exist real numbers $\psi_1$ and $\psi_2$ such that $\lim_{m,n \to \infty} \lim_{d \to \infty} P\{(dT_{WMW} - \|\mu\|^2 \psi_1)/\psi_2^{1/2} \leq x\} = \lim_{m,n \to \infty} \lim_{d \to \infty} P\{(T_{CQ}^{(2)} - \|\mu\|^2)/\Gamma_1^{1/2} \leq x\} = \Phi(x)$ for all $x \in \mathbb{R}$. Here, $\Phi$ is the cumulative distribution function of the standard Gaussian distribution, and $\Gamma_1$ is as defined in Theorem 2.1.

Unlike the setup considered in Section 2, where the coordinate variables are $\rho$-mixing, here the distributions of $T_{WMW}$ and $T_{CQ}^{(2)}$ cannot be approximated by Gaussian distributions when $m$ and $n$ are small even if $d$ is large. However, if the sample sizes are also large in addition to the data dimension, we can approximate the distributions of these statistics by Gaussian distributions. It is easy to see that many probability models with the RSRM property do not satisfy the model assumptions in (3.1) in Chen and Qin (2010). Nevertheless, the asymptotic distribution of $T_{CQ}^{(2)}$ obtained from part (b) of Theorem 3.1 coincides with that obtained in Theorem 1 in Chen and Qin (2010). Further, it also coincides with the Gaussian distribution obtained under the $\rho$-mixing model in Theorem 2.1.

Let $\beta_{T_{WMW}}(\mu)$ and $\beta_{T_{CQ}^{(2)}}(\mu)$ denote the powers of the tests based on $T_{WMW}$ and $T_{CQ}^{(2)}$ under the alternative hypothesis $H_A : \mu \neq 0$ at a given level of significance. The next theorem gives a comparison of the asymptotic powers of these tests.

Theorem 3.2. Assume that $\mathbf{Y}$ has the same distribution as $\mathbf{X} + \mu$. Suppose that all the conditions assumed in Theorem 3.1 hold. Also, assume that $\lim_{m,n \to \infty} \lim_{d \to \infty} \|\mu\|^2/\Gamma_1^{1/2} = c$ for some $c \in (0, \infty)$. Then, $\lim_{m,n \to \infty} \lim_{d \to \infty} \beta_{T_{WMW}}(\mu) > \lim_{m,n \to \infty} \lim_{d \to \infty} \beta_{T_{CQ}^{(2)}}(\mu)$.

If $\lim_{m,n \to \infty} \lim_{d \to \infty} \|\mu\|^2/\Gamma_1^{1/2}$ equals zero (respectively, infinity), then the asymptotic powers of the tests based on $T_{WMW}$ and $T_{CQ}^{(2)}$ in the setup of Theorem 3.2 coincide, and they are both equal to the nominal level (respectively, equal to one). Theorem 3.2 shows that for appropriate sequences of alternatives, the test based on $T_{WMW}$ is more powerful than the test based on $T_{CQ}^{(2)}$ for a large class of distributions including many spherical non-Gaussian distributions, when the data dimension as well as the sample sizes are large. Note that if $\mathbf{X}$ and $\mathbf{Y}$ have spherically symmetric distributions, then the conditions on $\mu$ in Theorems 3.1 and 3.2 hold if $\lim_{m,n \to \infty} \lim_{d \to \infty} (m + n)\|\mu\|^2/d^{1/2} = c' \in (0, \infty)$, and $\lim_{m,n \to \infty} m/(m + n) = \gamma \in (0, 1)$.

3.1 Empirical study using some RSRM models 

The limiting null distribution of $T_{WMW}$ obtainable from Theorem 3.1 cannot be used to implement this test because the parameters appearing in its limiting distribution cannot be estimated from the data. To compare the performances of the tests based on $T_{WMW}$ and $T_{CQ}^{(2)}$ for data from the spherical $t(5)$ distribution, we implemented these tests using their permutation distributions. Such an implementation has also been used by Wei et al. (2015) for their test. Though it is not possible to implement the test based on $T_{WMW}$ using its true asymptotic distribution in practice, we can do it for a simulation study, where the distributions and the associated parameters are known. On the other hand, since the true asymptotic null distribution of $T_{CQ}^{(2)}$ for RSRM models coincides with its asymptotic null distribution in the $\rho$-mixing setup, the implementation of this test can be done in the same way as described in subsection 2.1. We have chosen $m = n = 20$, and $\mu = (c, 0, 0, \ldots, 0)$ with $c = 1, 1.5, 2, 2.5, 3$ for $d = 100, 200, 400, 800, 1600$, respectively. Figure 2 shows that the sizes and the powers of these tests obtained by using the permutation implementation are not significantly different from the sizes and the powers of the tests implemented using their true asymptotic distributions. The permutation distributions of $T_{WMW}$ and $T_{CQ}^{(2)}$ adequately approximate their true distributions. Also, the test based on $T_{WMW}$ significantly outperforms the test based on $T_{CQ}^{(2)}$, which conforms with the result in Theorem 3.2.
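A permutation implementation of a two sample test can be sketched generically. The sketch assumes that large values of the statistic indicate departure from the null hypothesis, that random relabellings of the pooled sample are used rather than all possible ones, and that the observed labelling is counted once in the p-value, which is a common convention and not necessarily the one used here. The function name and argument list are hypothetical.

```python
import random


def permutation_pvalue(xs, ys, statistic, n_perm, rng):
    """Permutation p-value for a two sample statistic that is large under H_A."""
    observed = statistic(xs, ys)
    pooled = list(xs) + list(ys)
    m = len(xs)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled observations
        if statistic(pooled[:m], pooled[m:]) >= observed:
            exceed += 1
    # The "+1" counts the observed labelling itself, a common convention.
    return (exceed + 1) / (n_perm + 1)
```

Any function of two lists of vectors can be passed as `statistic`, and the test rejects at level $\alpha$ when the returned value is at most $\alpha$.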



Figure 2: Empirical sizes and powers of the tests based on $T_{WMW}$ (+) and $T_{CQ}^{(2)}$ (o) at nominal 5% level, plotted against $\log_2(d/50)$, for the spherical $t(5)$ distribution using the permutation implementation (solid curves) and the true implementation (dashed curves).


3.2 Asymptotic behaviours of one sample tests under stronger dependence 

We will now study the asymptotic distributions of the one sample tests considered in subsection 2.2 under the RSRM model. Let $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n$ be i.i.d. copies of a random vector $\mathbf{X} \in \mathbb{R}^d$. The following theorem summarizes the asymptotic distributions of $T_S$, $T_{SR}$ and $T_{CQ}^{(1)}$ and yields their asymptotic powers. As earlier, we can write $\mathbf{X}_i = \mu + \widetilde{\mathbf{V}}_i/P_i$, $1 \leq i \leq n$. Also, $\beta_{T_S}(\mu)$, $\beta_{T_{SR}}(\mu)$ and $\beta_{T_{CQ}^{(1)}}(\mu)$ denote the powers of the tests based on $T_S$, $T_{SR}$ and $T_{CQ}^{(1)}$ at a given level of significance, when the alternative hypothesis $H_A : \mu \neq 0$ is true.

Theorem 3.3. Let $\mathbf{X} = \mu + \mathbf{V}$, where $\mathbf{V}$ is the vector formed by the first $d$ coordinates of the sequence $\mathcal{V}$ satisfying condition (C5), and $\mu$ satisfies condition (C4) with $\Sigma$ in that condition replaced by $\mathrm{Disp}(\widetilde{\mathbf{V}})$.

(a) There exist $\tau_S > 0$ and random variables $Z_k$, $1 \leq k \leq 4$, which are functions of the $P_i$'s, such that each of $(dT_S - \|\mu\|^2 Z_1)/\tau_S^{1/2}$, $(dT_{SR} - 2\|\mu\|^2 Z_2)/(2Z_3^{1/2})$ and $(T_{CQ}^{(1)} - \|\mu\|^2)/Z_4^{1/2}$ converges weakly to a standard Gaussian variable as $d \to \infty$ for each $n > 1$. Consequently, for each fixed $n > 1$, the distributions of $T_S$, $T_{SR}$ and $T_{CQ}^{(1)}$ can be approximated by location and scale mixtures of Gaussian distributions, when the data dimension is large.

(b) Also, assume that both $E(P)$ and $E(P^{-2})$ are finite, and $\|\mu\|^2/d^{1/2}$ tends to a finite non-negative limit as $d \to \infty$. Define $\sigma^2 = \mathrm{Var}(X_1)$. There exist real numbers $\theta_k$, $1 \leq k \leq 3$, such that $\lim_{n \to \infty} \lim_{d \to \infty} P\{(d\sigma^2 T_S - \|\mu\|^2 \theta_1)/\Gamma_2^{1/2} \leq x\} = \lim_{n \to \infty} \lim_{d \to \infty} P\{(d\sigma^2 T_{SR} - 2\|\mu\|^2 \theta_2)/(2\theta_3^{1/2}) \leq x\} = \lim_{n \to \infty} \lim_{d \to \infty} P\{(T_{CQ}^{(1)} - \|\mu\|^2)/\Gamma_2^{1/2} \leq x\} = \Phi(x)$ for all $x \in \mathbb{R}$. Here, $\Phi$ denotes the cumulative distribution function of a standard Gaussian distribution, and $\Gamma_2$ is as defined in Theorem 2.3.

(c) Further, if we let $\lim_{n \to \infty} \lim_{d \to \infty} \|\mu\|^2/\Gamma_2^{1/2} = c$, where $c \in (0, \infty)$, we have $\lim_{n \to \infty} \lim_{d \to \infty} \beta_{T_S}(\mu) > \lim_{n \to \infty} \lim_{d \to \infty} \beta_{T_{CQ}^{(1)}}(\mu)$. We also have $\lim_{n \to \infty} \lim_{d \to \infty} \beta_{T_{SR}}(\mu) > \lim_{n \to \infty} \lim_{d \to \infty} \beta_{T_{CQ}^{(1)}}(\mu)$.

It is seen from the proof of part (a) of Theorem 3.3 that if $E(P^{-2}) < \infty$, we have $\tau_S = \sigma^{-4}\Gamma_2$. In this case, we get the same limiting null distribution of $T_S$ from parts (a) and (b), i.e., its limiting null distribution is Gaussian irrespective of whether the sample size grows to infinity or not. Further, this limiting null distribution under the RSRM model is the same as that obtained under the $\rho$-mixing model in part (a) of Theorem 2.3. This is because the spatial sign $S(\mathbf{x}) = \mathbf{x}/\|\mathbf{x}\|$, and thus $T_S$, remain invariant under homogeneous positive scale transformations of the coordinate variables.

Note that the asymptotic distribution of $T_{CQ}^{(1)}$ is the same as that obtained in Theorem 2.3 under the $\rho$-mixing setup, and it coincides with the asymptotic distribution of $T_{CQ}^{(1)}$ obtained by Chen and Qin (2010). For the spherical $t$ distribution, which is a distribution included in our RSRM models, Wang et al. (2015) derived the asymptotic distribution of $T_{CQ}^{(1)}$ and proved that the test based on $T_S$ is asymptotically more powerful than the former test. In the setup of Theorem 3.3, if $\lim_{n \to \infty} \lim_{d \to \infty} \|\mu\|^2/\Gamma_2^{1/2}$ equals zero (respectively, infinity), then the asymptotic powers of the tests based on $T_S$, $T_{SR}$ and $T_{CQ}^{(1)}$ coincide, and they are all equal to the nominal level (respectively, equal to one).

Remark 3.1. Suppose that in a two sample problem, $\mathbf{Y}$ is distributed as $\mathbf{X} + \mu$, where $\mathbf{X}$ is the vector formed by the first $d$ coordinates of a zero mean spherically symmetric or rotatable infinite sequence $\mathcal{X}$. Then, it follows from Theorem 1.31 in Kallenberg (2005) that $\mathbf{X} = \mathbf{V}/P$, where $\mathbf{V}$ is a standard spherical Gaussian vector, and $P$ is a non-negative random variable independent of $\mathbf{V}$. Suppose that $\lim_{m,n \to \infty} \lim_{d \to \infty} (m + n)\|\mu\|^2/d^{1/2} = c' \in (0, \infty)$ and $\lim_{m,n \to \infty} m/(m + n) = \gamma \in (0, 1)$. Also, assume that both $E(P)$ and $E(P^{-2})$ are finite and positive. Then, it follows from Theorems 2.2 and 3.2 that the test based on $T_{WMW}$ is asymptotically at least as powerful as the test based on $T_{CQ}^{(2)}$ if we first let the dimension and then the sample sizes grow to infinity. Further, their asymptotic powers are equal if and only if $\mathbf{X}$ has a spherical Gaussian distribution. In fact, in this case, their asymptotic powers are the same for any sample sizes if only the dimension grows to infinity.

Remark 3.2. Suppose that in a one sample problem, we have $\mathbf{X} = \mu + \mathbf{V}$, where $\mathbf{V}$ is the vector formed by the first $d$ coordinates of a spherically symmetric infinite sequence. Assume that $\lim_{n \to \infty} \lim_{d \to \infty} n\|\mu\|^2/d^{1/2} = c' \in (0, \infty)$. Also, let both $E(P)$ and $E(P^{-2})$ be finite. Then, it follows from Theorems 2.3 and 3.3 that the tests based on $T_S$ and $T_{SR}$ are asymptotically at least as powerful as the test based on $T_{CQ}^{(1)}$ if we first let the dimension and then the sample size grow to infinity. Further, the asymptotic powers of all three tests are equal if and only if the distribution of $\mathbf{X}$ is spherical Gaussian. In fact, in this case, their asymptotic powers are the same for any sample size if only the dimension grows to infinity.

4 Analysis of real data 

We now investigate the performances of the two sample tests based on $T_{WMW}$ and $T_{CQ}^{(2)}$ on some 
real datasets, when they are implemented in two different ways, namely, as in the ρ-mixing setup 
described in subsection 2.2, and using their permutation distributions. Two datasets are obtained 
from http://www.cs.ucr.edu/~eamonn/time_series_data. The first of them is the ECG 
Data, which contains 69 normal ECG curves and 31 ECG curves of patients with a particular 
heart disease, and each curve is measured at 96 time points. The second is the Gun Data, 
which contains the readings along the horizontal axis of the centroid of the right hand during two 
action sequences, namely, gun-draw and gun-point, with 24 samples and 26 samples, respectively. 
Each action sequence is recorded at 150 time points. The third is the Colon Data, which 
is obtained from http://datam.i2r.a-star.edu.sg/datasets/krbd/ColonTumor/ColonTumor.zip 
and contains the expression levels of 2000 genes from 40 tumor tissues and 22 normal tissues. The 
fourth is the Sonar Data obtained from http://archive.ics.uci.edu/ml/datasets.html, 
which contains sonar signals emitted from 111 metal cylinder samples and 97 rock samples, and 
each signal is recorded at 60 wavelengths. To estimate the sizes of the tests based on $T_{WMW}$ and 
$T_{CQ}^{(2)}$ for each dataset, we selected two random subsamples 1000 times from one class in that dataset 
and computed the proportion of rejections for each test. The same procedure was then repeated for 
the other class, and the two values obtained for each test were averaged. For evaluating the powers 
of these tests, we selected 1000 random subsamples each from the two classes and computed the 
proportions of rejections for the tests. The size of each subsample is 20%, 40%, 40% and 20% of 
the original sample size for the ECG Data, the Gun Data, the Colon Data and the Sonar Data, 
respectively. These choices are made to ensure that the resulting datasets remain high dimensional, 
and that the powers of the tests are neither too close to the nominal 5% level nor to one. For computing 
the permutation distributions of the test statistics, we have used 500 random permutations of the 
two subsamples. 
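
The two statistics and the permutation implementation described above can be sketched as follows. This is only an illustrative sketch, not the code used for Table 1: it assumes that $T_{WMW}$ and $T_{CQ}^{(2)}$ are the double U-statistics over $i_1 \neq i_2$ and $j_1 \neq j_2$ that appear in the proof of Theorem 2.1 (with spatial signs of $X_i - Y_j$ for $T_{WMW}$), and that the tests reject for large values. The function names and the `(count + 1) / (n_perm + 1)` p-value convention are our own assumptions.

```python
import numpy as np


def _double_u_stat(u):
    """Average of u[i1, j1] . u[i2, j2] over i1 != i2 and j1 != j2.

    u has shape (m, n, d). Inclusion-exclusion avoids the four-fold loop.
    """
    m, n, _ = u.shape
    total = np.sum(u.sum(axis=(0, 1)) ** 2)   # all (i1, j1, i2, j2)
    same_i = np.sum(u.sum(axis=1) ** 2)       # terms with i1 == i2
    same_j = np.sum(u.sum(axis=0) ** 2)       # terms with j1 == j2
    same_ij = np.sum(u ** 2)                  # terms with both equal
    return (total - same_i - same_j + same_ij) / (m * (m - 1) * n * (n - 1))


def t_cq2(x, y):
    """Mean based statistic (assumed form of T_CQ^(2))."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return _double_u_stat(x[:, None, :] - y[None, :, :])


def t_wmw(x, y):
    """Spatial sign based statistic (assumed form of T_WMW)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    diff = x[:, None, :] - y[None, :, :]
    norms = np.linalg.norm(diff, axis=-1, keepdims=True)
    signs = np.divide(diff, norms, out=np.zeros_like(diff), where=norms > 0)
    return _double_u_stat(signs)


def permutation_pvalue(x, y, stat, n_perm=500, rng=None):
    """Upper-tail permutation p-value from n_perm random relabellings."""
    rng = np.random.default_rng(rng)
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    pooled = np.vstack([x, y])
    m = x.shape[0]
    observed = stat(x, y)
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(pooled.shape[0])
        if stat(pooled[idx[:m]], pooled[idx[m:]]) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)
```

With such helpers, the empirical size for one class would be the fraction of 1000 pairs of random subsamples from that class for which the p-value falls below 0.05, and the empirical power would be the same fraction when the two subsamples come from different classes.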

Table 1 shows that the sizes as well as the powers of the tests under the two implementations are 
not significantly different. However, the permutation implementation required almost ten times 
more computing time. Moreover, the sizes of the tests are close to the nominal 5% level for all 
four datasets. Further, the powers of the tests based on $T_{WMW}$ and $T_{CQ}^{(2)}$ are not significantly 
different for the ECG Data and the Gun Data. However, the test based on $T_{WMW}$ is significantly 
more powerful than the test based on $T_{CQ}^{(2)}$ for the Colon Data and the Sonar Data. 
Table 1: Sizes and powers of the tests based on $T_{WMW}$ and $T_{CQ}^{(2)}$ at nominal 5% level for some real 
data. 

Data               ECG             Gun             Colon           Sonar
                   Size   Power    Size   Power    Size   Power    Size   Power

Implementation as in the ρ-mixing setup
$T_{WMW}$          0.052  0.593    0.052  0.501    0.056  0.747    0.036  0.507
$T_{CQ}^{(2)}$     0.063  0.601    0.058  0.500    0.063  0.641    0.058  0.432

Permutation implementation
$T_{WMW}$          0.057  0.643    0.055  0.472    0.055  0.723    0.043  0.519
$T_{CQ}^{(2)}$     0.057  0.624    0.052  0.442    0.060  0.596    0.038  0.360



5 Concluding remarks and discussion 

We now consider the performances of some other mean based tests studied in the literature and 
discussed in Section 2 on some simulated datasets. We denote the test statistics associated with the 
tests in Srivastava et al. (2013) and Gregory et al. (2014) by $T_{SKK}$ and $T_{GCBL}$, respectively. For 
the AR(1) models in subsection 2.1, we found that the size of the test based on $T_{SKK}$ increases with 
d and becomes significantly larger than the nominal 5% level for d > 400. Feng et al. (2015) proved 
that the size of this test converges to one as the dimension and the sample sizes grow to infinity at 
a certain rate for a class of models, which includes these AR(1) models. Under the spherical t(5) 
model in subsection 3.1, the size of the test based on $T_{SKK}$ is significantly less than the nominal 
level for all values of d considered and decreases to zero as d increases. The size of the test based 
on $T_{GCBL}$ is significantly larger than the nominal level for all values of d considered under the 
AR(1) models as well as the spherical t(5) model. It seems that the estimates of the critical values 
for the tests based on $T_{SKK}$ and $T_{GCBL}$ are adversely affected if the sample size is much smaller 
than the dimension, as in our simulation study. On the other hand, we found that permutation 
implementations of these tests correct their sizes under all of the above models. Even then, these 
tests are significantly less powerful than the test based on $T_{WMW}$ under all of the above models, 
and significantly less powerful than the test based on $T_{CQ}^{(2)}$ under the AR(1) models, but they 
outperform the test based on $T_{CQ}^{(2)}$ under the spherical t(5) model. The readers are referred to 
Appendix III for more details. 

Cai et al. (2014) showed that their test has better power than other tests based on the sum of 
squares of coordinatewise mean differences or coordinatewise t statistics, when the mean shift has 
only a few non-zero coordinates. However, we observed that this test becomes significantly less 
powerful than the tests based on $T_{WMW}$ and $T_{CQ}^{(2)}$, when the mean shifts in the models considered 
in subsections 2.1 and 3.1 are distributed equally among all the coordinates. Moreover, the size 
of the test in Cai et al. (2014) increases with d and becomes significantly larger than the nominal 
level for d > 400 under all of the above models. It seems that the asymptotic extreme value 
distribution of this statistic is not adequate if the data dimension is much larger than the sample 
size. Since the test in Cai et al. (2014) involves a computationally intensive optimization involving 
sample dispersion matrices, we could not implement this test using the permutation approach. The 
detailed results of the simulation study are provided in Appendix III. 

Multivariate Gaussian distributions with dispersion matrices of the form $(1-\beta)I_d + \beta 1_d 1_d'$ for 
some $\beta \in (0,1)$, where $1_d$ denotes the d-dimensional vector of ones, are neither ρ-mixing nor have 
the RSRM property. Recently, Katayama and Kano (2014) mentioned that for such probability 
models for high dimensional data, the size of the test based on $T_{CQ}^{(2)}$ would be asymptotically incorrect. 
To compare the performance of the tests based on $T_{WMW}$ and $T_{CQ}^{(2)}$ for such models, we have 
chosen $\beta = 0.7$, m = n = 20 and used the permutation implementations of these tests. The mean 
shifts chosen are $\mu = (c, 0, 0, \ldots, 0)'$ with c = 2.5, 5, 7.5, 10, 12.5 for d = 100, 200, 400, 800, 1600, 
respectively. We found that the test based on $T_{WMW}$ significantly outperforms the test based on 
$T_{CQ}^{(2)}$ for all values of d (see Appendix III). 
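
Data from the equicorrelated Gaussian model above can be generated without forming the d × d dispersion matrix. The following is a minimal sketch under the assumption that one shared standard normal variable per observation, scaled by $\sqrt{\beta}$, is added to independent coordinates scaled by $\sqrt{1-\beta}$, which gives dispersion $(1-\beta)I_d + \beta 1_d 1_d'$. The function names are hypothetical.

```python
import numpy as np


def equicorrelated_gaussian(n_samples, d, beta, mean=None, rng=None):
    """Draw rows from N(mean, (1 - beta) * I_d + beta * 1_d 1_d')."""
    rng = np.random.default_rng(rng)
    independent = rng.standard_normal((n_samples, d))
    common = rng.standard_normal((n_samples, 1))  # shared across coordinates
    x = np.sqrt(1.0 - beta) * independent + np.sqrt(beta) * common
    if mean is not None:
        x = x + np.asarray(mean, dtype=float)
    return x


def first_coordinate_shift(c, d):
    """Mean shift (c, 0, ..., 0)' used in the comparison above."""
    mu = np.zeros(d)
    mu[0] = c
    return mu


# Example matching one setting in the text: beta = 0.7, m = n = 20, d = 100, c = 2.5.
x_sample = equicorrelated_gaussian(20, 100, 0.7, rng=0)
y_sample = equicorrelated_gaussian(20, 100, 0.7, mean=first_coordinate_shift(2.5, 100), rng=1)
```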


Appendix — I 

Proof of Theorem 2.1. Without any loss of generality, we can take $E(X_i) = 0$. Let us write 
$X_i = (X_{i1}, X_{i2}, \ldots, X_{id})'$, $1 \le i \le m$, and $Y_j = (Y_{j1}, Y_{j2}, \ldots, Y_{jd})'$, $1 \le j \le n$. First note that, 
writing $V = X$ and $W = Y - \mu$ with coordinates $V_k$ and $W_k$, 

\[
\|X - Y\|^2 = \|X\|^2 + \|Y - \mu\|^2 - 2X'(Y - \mu) - 2\mu'(X - Y + \mu) + \|\mu\|^2
            = \sum_{k=1}^{d} (V_k - W_k)^2 - 2\mu'(V - W) + \|\mu\|^2. \tag{5.1}
\]

It follows from Bradley (2005, Theorem 5.2(b)) that for any function $h : \mathbb{R}^2 \to \mathbb{R}$, the sequence 
$\{h(V_k, W_k) : k \ge 1\}$ is ρ-mixing with its mixing coefficient bounded by $\max\{\rho_1(\cdot), \rho_2(\cdot)\}$. This fact and 
(5.1) above along with Assumptions (C1)-(C3) and Theorem 8.2.2 in Lin and Lu (1996) imply that 
for any given $\epsilon \in (0, 1/2)$, we have 

\[
\|X - Y\|^2/d - (\sigma_1^2 + \sigma_2^2) = o(d^{-1/2+\epsilon}) \tag{5.2}
\]

as $d \to \infty$ almost surely. Now, 
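
\[
T_{WMW} = \frac{1}{(m)_2 (n)_2} \sum_{i_1 \ne i_2} \sum_{j_1 \ne j_2}
            \frac{(X_{i_1} - Y_{j_1})'(X_{i_2} - Y_{j_2})}{d(\sigma_1^2 + \sigma_2^2)}
        + \frac{1}{(m)_2 (n)_2} \sum_{i_1 \ne i_2} \sum_{j_1 \ne j_2}
            \frac{(X_{i_1} - Y_{j_1})'(X_{i_2} - Y_{j_2})}{d(\sigma_1^2 + \sigma_2^2)}
            \left\{ \frac{d(\sigma_1^2 + \sigma_2^2)}{\|X_{i_1} - Y_{j_1}\| \, \|X_{i_2} - Y_{j_2}\|} - 1 \right\}
        = (T_{CQ}^{(2)} + T_{WMW}^{(2)}) / [d(\sigma_1^2 + \sigma_2^2)], \tag{5.3}
\]

where $T_{CQ}^{(2)} = [(m)_2 (n)_2]^{-1} \sum_{i_1 \ne i_2} \sum_{j_1 \ne j_2} (X_{i_1} - Y_{j_1})'(X_{i_2} - Y_{j_2})$ as defined in the Introduction, 
and 

\[
T_{WMW}^{(2)} = \frac{1}{(m)_2 (n)_2} \sum_{i_1 \ne i_2} \sum_{j_1 \ne j_2}
    (X_{i_1} - Y_{j_1})'(X_{i_2} - Y_{j_2})
    \left\{ \frac{d(\sigma_1^2 + \sigma_2^2)}{\|X_{i_1} - Y_{j_1}\| \, \|X_{i_2} - Y_{j_2}\|} - 1 \right\}.
\]
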

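So, $E(T_{CQ}^{(2)}) = \|\mu\|^2$. Further, it follows from Chen and Qin (2010, p. 825) that $Var(T_{CQ}^{(2)}) = 
\Gamma_1 + 4\mu'\Sigma_1\mu/m + 4\mu'\Sigma_2\mu/n$, where $\Gamma_1 = 2\,tr(\Sigma_1^2)/(m)_2 + 2\,tr(\Sigma_2^2)/(n)_2 + 4\,tr(\Sigma_1\Sigma_2)/(mn)$ is defined 
in the statement of the theorem. Note that $(\mu'\Sigma_1\mu/m) + (\mu'\Sigma_2\mu/n) \le \mu'(\Sigma_1 + \Sigma_2)\mu/\min(m,n)$. 
Also, the denominator of each of the three terms in $\Gamma_1$ is at most $N^2$, where $N = \max(m,n)$. 
This implies that $\Gamma_1 \ge [2\,tr(\Sigma_1^2) + 2\,tr(\Sigma_2^2) + 4\,tr(\Sigma_1\Sigma_2)]/N^2 = 2\,tr[(\Sigma_1 + \Sigma_2)^2]/N^2$. These facts 
and Assumption (C3) imply that $Var(T_{CQ}^{(2)}) = \Gamma_1(1 + o(1))$ as $d \to \infty$. Further, 

\[
T_{CQ}^{(2)} - \|\mu\|^2
  = \frac{1}{(m)_2 (n)_2} \sum_{i_1 \ne i_2} \sum_{j_1 \ne j_2} (X_{i_1} - Y_{j_1} + \mu)'(X_{i_2} - Y_{j_2} + \mu)
    - \frac{2}{mn} \sum_{i,j} \mu'(X_i - Y_j + \mu)
  = T_1 - T_2,
\]

where $T_1 = [(m)_2 (n)_2]^{-1} \sum_{i_1 \ne i_2} \sum_{j_1 \ne j_2} (X_{i_1} - Y_{j_1} + \mu)'(X_{i_2} - Y_{j_2} + \mu)$ and $T_2 = 2(mn)^{-1} \sum_{i,j} \mu'(X_i - 
Y_j + \mu)$. It is easy to verify that $E(T_2) = 0$ and $Var(T_2) = 4\mu'[(\Sigma_1/m) + (\Sigma_2/n)]\mu$. So, using the 
inequality $\Gamma_1 \ge 2\,tr[(\Sigma_1 + \Sigma_2)^2]/N^2$, Assumption (C3) and Chebyshev's inequality, it follows that 
$T_2/\Gamma_1^{1/2}$ converges to zero in probability as $d \to \infty$. Note that, with $V_i = X_i$ and $W_j = Y_j - \mu$, 

\[
T_1 = \frac{1}{(m)_2 (n)_2} \sum_{k=1}^{d} \sum_{i_1 \ne i_2} \sum_{j_1 \ne j_2} (V_{i_1 k} - W_{j_1 k})(V_{i_2 k} - W_{j_2 k}).
\]

So, $E(T_1) = 0$ and $Var(T_1) = \Gamma_1$. This follows from computations similar to those used in deriving 
$Var(T_{CQ}^{(2)})$ earlier. Thus, by Theorem 4.0.1 in Lin and Lu (1996) and Assumptions (C1) and (C2), 
we have the weak convergence of $T_1/\Gamma_1^{1/2}$ to a standard Gaussian distribution as $d \to \infty$ for each 
fixed m, n > 1. This and the fact that $T_2/\Gamma_1^{1/2}$ converges to zero in probability as $d \to \infty$ for each fixed 
m, n > 1 together imply that 

\[
(T_{CQ}^{(2)} - \|\mu\|^2)/\Gamma_1^{1/2} \stackrel{\mathcal{L}}{\longrightarrow} N(0,1) \tag{5.4}
\]

as $d \to \infty$ for each fixed m, n > 1. Next, let us write 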


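
\[
T_{WMW}^{(2)}/\Gamma_1^{1/2}
  = \frac{1}{(m)_2 (n)_2} \sum_{i_1 \ne i_2} \sum_{j_1 \ne j_2}
      \frac{(X_{i_1} - Y_{j_1})'(X_{i_2} - Y_{j_2}) - \|\mu\|^2}{\Gamma_1^{1/2}}
      \left\{ \frac{d(\sigma_1^2 + \sigma_2^2)}{\|X_{i_1} - Y_{j_1}\| \, \|X_{i_2} - Y_{j_2}\|} - 1 \right\}
  + \frac{\|\mu\|^2}{(m)_2 (n)_2 \Gamma_1^{1/2}} \sum_{i_1 \ne i_2} \sum_{j_1 \ne j_2}
      \left\{ \frac{d(\sigma_1^2 + \sigma_2^2)}{\|X_{i_1} - Y_{j_1}\| \, \|X_{i_2} - Y_{j_2}\|} - 1 \right\}
  = T_{WMW}^{(3)} + T_{WMW}^{(4)}, \tag{5.5}
\]

where $T_{WMW}^{(3)}$ and $T_{WMW}^{(4)}$ denote the first and the second terms in the above sum, respectively. 
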

As mentioned earlier, $\Gamma_1 \ge 2\,tr[(\Sigma_1 + \Sigma_2)^2]/N^2$. Also, from the stationarity of the sequences X 
and Y and using the Cauchy-Schwarz inequality, it follows that $tr[(\Sigma_1 + \Sigma_2)^2] \ge d(\sigma_1^2 + \sigma_2^2)^2$. These 
facts along with (5.2) and Assumption (C3) imply that each term inside the double summation 
appearing in the definition of $T_{WMW}^{(4)}$ converges to zero in probability as $d \to \infty$ for each fixed 
m, n > 1. So, $T_{WMW}^{(4)}$ converges to zero in probability as $d \to \infty$ for each fixed m, n > 1. 

Next, fix any $i_1 \ne i_2$ and $j_1 \ne j_2$ and consider the corresponding term inside the double 
summation appearing in the definition of $T_{WMW}^{(3)}$. It follows from (5.2) that $d(\sigma_1^2 + \sigma_2^2)/[\|X_{i_1} - 
Y_{j_1}\| \, \|X_{i_2} - Y_{j_2}\|] - 1$ converges to zero in probability as $d \to \infty$. Also, note that 

\[
(X_{i_1} - Y_{j_1})'(X_{i_2} - Y_{j_2}) - \|\mu\|^2 = (X_{i_1} - Y_{j_1} + \mu)'(X_{i_2} - Y_{j_2} + \mu)
    - \mu'(X_{i_1} - Y_{j_1} + \mu) - \mu'(X_{i_2} - Y_{j_2} + \mu). \tag{5.6}
\]

Using arguments similar to those used to prove the asymptotic normality of $T_{CQ}^{(2)}$ and using 
Theorem 4.0.1 in Lin and Lu (1996), it follows that the first term in the right hand side of (5.6) is 
asymptotically Gaussian with zero mean and variance $2\,tr[(\Sigma_1 + \Sigma_2)^2]$ as $d \to \infty$. Using Assumption 
(C3) and Chebyshev's inequality, it follows that the second and the third terms in the right hand 
side of (5.6) after scaling by $\Gamma_1^{1/2}$ converge to zero in probability as $d \to \infty$. So, the left hand side 
of (5.6) after scaling by $\Gamma_1^{1/2}$ converges weakly to a Gaussian distribution as $d \to \infty$. Thus, $T_{WMW}^{(3)}$ 
converges to zero in probability as $d \to \infty$ for each fixed m, n > 1. This and the fact that $T_{WMW}^{(4)}$ 
converges to zero in probability as $d \to \infty$ together imply that $T_{WMW}^{(2)}/\Gamma_1^{1/2}$ converges to zero in 
probability as $d \to \infty$ for each fixed m, n > 1. Combining this fact with (5.3) and (5.4) yields 

\[
\{d(\sigma_1^2 + \sigma_2^2) T_{WMW} - \|\mu\|^2\}/\Gamma_1^{1/2} \stackrel{\mathcal{L}}{\longrightarrow} N(0,1) \tag{5.7}
\]

as $d \to \infty$ for each fixed m, n > 1. □ 
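
The scaling constant $\Gamma_1$ and the lower bound $2\,tr[(\Sigma_1 + \Sigma_2)^2]/N^2$ used repeatedly in the proof can be checked numerically. The sketch below assumes an AR(1) dispersion matrix with unit innovation variance purely for illustration; `ar1_cov`, `gamma1` and `gamma1_lower_bound` are hypothetical helper names.

```python
import numpy as np


def ar1_cov(d, phi, innovation_var=1.0):
    """Dispersion matrix of d consecutive values of a stationary AR(1) process."""
    idx = np.arange(d)
    lags = np.abs(idx[:, None] - idx[None, :])
    return innovation_var / (1.0 - phi ** 2) * phi ** lags


def gamma1(sigma1, sigma2, m, n):
    """Gamma_1 = 2 tr(S1^2)/(m)_2 + 2 tr(S2^2)/(n)_2 + 4 tr(S1 S2)/(mn)."""
    return (2.0 * np.trace(sigma1 @ sigma1) / (m * (m - 1))
            + 2.0 * np.trace(sigma2 @ sigma2) / (n * (n - 1))
            + 4.0 * np.trace(sigma1 @ sigma2) / (m * n))


def gamma1_lower_bound(sigma1, sigma2, m, n):
    """Lower bound 2 tr[(S1 + S2)^2] / N^2 with N = max(m, n)."""
    total = sigma1 + sigma2
    return 2.0 * np.trace(total @ total) / max(m, n) ** 2


s1 = ar1_cov(200, 0.3)
s2 = ar1_cov(200, 0.5)
print(gamma1(s1, s2, 20, 20), gamma1_lower_bound(s1, s2, 20, 20))
```

The bound holds because each of the denominators $(m)_2$, $(n)_2$ and $mn$ is at most $N^2$.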


Proof of Theorem 2.2. Let $\zeta_\alpha$ be the $(1 - \alpha)$-quantile of the standard Gaussian distribution. Note 
that 

\[
\beta_{T_{WMW}}(\mu) = P\{d(\sigma_1^2 + \sigma_2^2) T_{WMW}/\Gamma_1^{1/2} > \zeta_\alpha\}
  = P\{[d(\sigma_1^2 + \sigma_2^2) T_{WMW} - \|\mu\|^2]/\Gamma_1^{1/2} > \zeta_\alpha - \|\mu\|^2/\Gamma_1^{1/2}\}
\]

and 

\[
\beta_{T_{CQ}^{(2)}}(\mu) = P\{T_{CQ}^{(2)}/\Gamma_1^{1/2} > \zeta_\alpha\}
  = P\{(T_{CQ}^{(2)} - \|\mu\|^2)/\Gamma_1^{1/2} > \zeta_\alpha - \|\mu\|^2/\Gamma_1^{1/2}\},
\]

where the probabilities are computed under the alternative hypothesis. Since $\lim_{d \to \infty} \|\mu\|^2/\Gamma_1^{1/2}$ 
exists, the equality of the asymptotic powers of the tests based on $T_{WMW}$ and $T_{CQ}^{(2)}$ follows from 
(5.4) and (5.7). Moreover, their common value is $\Phi(-\zeta_\alpha + \lim_{d \to \infty} \|\mu\|^2/\Gamma_1^{1/2}) = \Phi(-\zeta_\alpha + c)$, which 
follows from the expressions of their powers and their asymptotic Gaussian distributions proved in 
Theorem 2.1. The last part of the present theorem now follows easily. □ 
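
The common limiting power $\Phi(-\zeta_\alpha + c)$ is easy to evaluate for a given value of $c = \lim_{d \to \infty} \|\mu\|^2/\Gamma_1^{1/2}$. A minimal sketch using only the Python standard library follows; the function name `limiting_power` is our own.

```python
from statistics import NormalDist


def limiting_power(c, alpha=0.05):
    """Limiting power Phi(-zeta_alpha + c) of a level-alpha upper-tail test."""
    standard_normal = NormalDist()
    zeta_alpha = standard_normal.inv_cdf(1.0 - alpha)  # (1 - alpha)-quantile
    return standard_normal.cdf(-zeta_alpha + c)


# c = 0 corresponds to the null hypothesis, so the power equals the level.
for c in (0.0, 1.0, 2.0, 3.0):
    print(c, round(limiting_power(c), 3))
```

The limit depends on the alternative only through $c$, which is why it does not involve the tail behaviour of the underlying distribution.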


Proof of Theorem 2.3. (a) We will derive the asymptotic distributions of $T_{SR}$ and $T_{CQ}^{(1)}$ only, since 
the derivation of that of $T_S$ is simpler and follows from similar arguments. Using the assumptions 
in the theorem and arguments similar to those in the proof of Theorem 2.1, we have 

\[
T_{SR} = \frac{1}{(n)_4} \sum_{i_1, i_2, i_3, i_4 \ \text{all distinct}}
           \frac{(X_{i_1} + X_{i_2})'(X_{i_3} + X_{i_4})}{2d\sigma^2}
       + \frac{1}{(n)_4} \sum_{i_1, i_2, i_3, i_4 \ \text{all distinct}}
           \frac{(X_{i_1} + X_{i_2})'(X_{i_3} + X_{i_4})}{2d\sigma^2}
           \left\{ \frac{2d\sigma^2}{\|X_{i_1} + X_{i_2}\| \, \|X_{i_3} + X_{i_4}\|} - 1 \right\}. \tag{5.8}
\]

The first term in (5.8) equals $2T_{CQ}^{(1)}/(d\sigma^2)$. Using Assumption (C4), it can be shown that $E(T_{CQ}^{(1)}) = 
\|\mu\|^2$ and $Var(T_{CQ}^{(1)}) = \Gamma_2(1 + o(1))$ as $d \to \infty$. Using arguments similar to those in the proof 
of Theorem 2.1, we have the weak convergence of $(T_{CQ}^{(1)} - \|\mu\|^2)/\Gamma_2^{1/2}$ to a standard Gaussian 
distribution. Further, the second term in (5.8) after scaling by $\Gamma_2^{1/2}$ converges to zero in probability 
as $d \to \infty$ for each n > 1. The previous two statements together imply that $(d\sigma^2 T_{SR} - 2\|\mu\|^2)/\Gamma_2^{1/2}$ 
converges weakly to a N(0,4) distribution as $d \to \infty$ for each n > 1. 

(b) The proof of this part of the theorem follows from arguments similar to those used in the proof 
of Theorem 2.2. □ 


Proof of Theorem 3.1. Without any loss of generality, we can take $\mu_1 = 0$, so that $\mu = \mu_2$. Let 
us write $X_i = \tilde{V}_i$ and $Y_j = \mu + \tilde{W}_j$, where $\tilde{V}_i = V_i/P_i$ and $\tilde{W}_j = W_j/Q_j$ for $1 \le i \le m$ 
and $1 \le j \le n$. Let $V = (V_1, V_2, \ldots, V_d)'$ and $W = (W_1, W_2, \ldots, W_d)'$. Denote $\Sigma_V = Disp(V)$, 
$\Sigma_W = Disp(W)$, $\sigma_V^2 = Var(V_1)$ and $\sigma_W^2 = Var(W_1)$. 

(a) We will first derive the asymptotic distribution of $T_{WMW}$. Using similar arguments as those 
used in proving (5.1), we get 

\[
\|X - Y\|^2 = \sum_{k=1}^{d} \left\{ \frac{V_k^2}{P^2} + \frac{W_k^2}{Q^2} - \frac{2 V_k W_k}{PQ} \right\}
            - 2\mu'\left( \frac{V}{P} - \frac{W}{Q} \right) + \|\mu\|^2. \tag{5.9}
\]

Consider the event $E = \{\|X - Y\|^2/d - (\sigma_V^2/P^2 + \sigma_W^2/Q^2) = o(d^{-1/2+\epsilon})$ as $d \to \infty\}$. It follows from 
Bradley (2005, Theorem 5.2(b)) that for any function $h : \mathbb{R}^2 \to \mathbb{R}$, the sequence $(h(V_k, W_k) : k \ge 1)$ 
is ρ-mixing with its mixing coefficient bounded by $\max\{\rho_1(\cdot), \rho_2(\cdot)\}$. Using this fact and (5.9) above 
along with the assumptions in the theorem and Theorem 8.2.2 in Lin and Lu (1996), we get that 
for any given $\epsilon \in (0, 1/2)$, 

\[
\Pr(E \mid P, Q) = 1 \tag{5.10}
\]

for almost every P and Q. Now, 

1 


Twmw = 


(m) 2 (n)s 


+ 


E E 


(X„-Y,.)'(X„-Y 


J2J 


* 1^12 jl7^j2 ' V h 




12 


'w^h 


E E 


(m) 2 (n )2 L 


X,. -Y,.)'(X„ -Y 


n) 






\^i 2 -^n\ 


- 1 


frpW + )/(l 

y^WMW ^ 


(5.11) 


where 


and 


r, 


( 1 ) 

WMW 


1 V V (Xn-Y9l/(X.,-Y,J 

(m)2(n)2 d{alP-^ + alrQ^^YPia^Pr^ + 


P 


( 2 ) ^ 1 
WMW (771)2 (n)2 


E E 


U 7^*2 717^92 L ' ^ h 


(X„ - Y,.)'(X„ - Y„ ) 


H<ylPu‘ + 'yl-Qn)'H'ylPr:‘ + 


X 


12 


'W'^j 2 




'W^ji 

|Xn-Y,J 


12 

\^^ 2 -^n\ 


- 1 


Some straightforward algebra yields 


n(l) 

-WMW 


^( 777 ) 2 ( 77)2 


1 

^( 777 ) 2 ( 77)2 




il^i 2 




717^72 


[PnP^,]-^A,^,i,Y'^Y^, - 2 J][P,Q,]-ia,,V'(Q,/U +W,X5.12) 


*17^*2 


*.7 


+ '^[QjiQj2] ^-^71.72(^71/^ + Wjj'(( 5 j 2/7 + w 


727 7 ) 


717^72 


where ^ + ^^^^ 72 ^) ^ 71.72 = + 

^wQj2^ ^^‘^i^vPi2^P^wQji) =Ei2^nEj2^jS^vPii^P^wQj2'^ ^^‘^^^vPi2^P^wQji) 


U 2 . 
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Define 


^ >^h^i2h^j2 

It follows from the expression of in (5.12) that Qj, l<i<m,l<j<n) = 

Further, it can be shown that 

Var{dT^^j^^\Pi, Qj,l < i < m,l < j < n) 

= S2 + 77—tt“TT 9 


[M 2 (n) 2 ]^ 




+ 


= 52 + 


1 


[{m) 2 {n) 2 f 


y~l {/i p Pj^ )Pjl,j2i‘^Pjl,j2 P Pjl,j3 P Pj2,j3) 


J1 J2 J3 
all distinct 


[(m) 2 (n) 2 ] 


r{Li/r'Sy/r + L21j!TjwIJ‘}, 


(5.13) 


where Li = and L 2 = Here, 

the latter summation is taken over distinct indices ji,j 2 and js. Also, 


52 = 


[( 771 ) 2 ( 77 ) 2 ] 


2 j;; 

07^*2 


+ 2 ^ [Q,,g,,]-24_.^tr(s2^) +4 j;[P,g,]-2C'2^.tr(SyS^) \ , 

jl¥^j2 hj ) 

= {L3tr(S^) + L4tr(S^) + 2L5tr(Sv/Svy)}/[(7T7)2(77)2]^, 

where Lg = 2Y^.^^.^[Pi^Pi^]-^Al^i^, U = 2Y.j,^j^[QjiQj2]~^Blj2’ ^nd L5 = 2J2ij[PiQj]~^CP. 
Note that [Li^'T^v^J‘pL2^P^wiP) < max{Li, L2}/i'(Sy+Sw)l7. Also, S2 > [(777)2(77)2]“^ minlLg, L4, 
L5}[tr(Sy) + tr(S^) + 2 tr(EySvi/)] = [(777)2(77)2]“^ minlLg, L4, L5}tr[(Sv + T^w)\ These facts 
along with ( 5 . 13 ) and Assumption (C 3 ) imply that Var{dT^j^y^\Pi,Qj, 1 < 7 < 777 ,1 < j < 77) = 
52(1 + 0(1)) as d —>■ 00. Now, 


{dPwMW 


-Si)/Sl 


(X,,-Y,4+^)^(X,,-Y,,+/7) 
(777)2(77)2 P ^wQj‘!^Y^‘^{^vPi2 P 


E E 


(777)2(77) 


^a,,77'(Xi-Y,+/7) 




( 1 ) 


( 2 ) N/eV 2 


isT 


_ rpK'^) \ j C-*-/^ 

WMW ^WMW)I^2 ’ 


(5.14) 


where 


T, 


( 1 ) 


1 


WMW 


( 777 ) 2 ( 77)5 


E E 

*17^*2 jl7^i2 


(X,,-Y,,+/ 7 )^(X,,-Y,,+/ 7 ) 

dKp-^ p <^^wQnYH<^lPp,^ p 
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and T^mw = 2[(m)2(n)2] ^ Ylij -Yj + ^). It can be shown that E{T^'^J^y^,\Pi, Qj, 1 < 

i < m,l < j < n) = 0 and 

yar{f^^j^^\Pi,Qj, 1 <i <m,l < j <n) = 4[(m)2(n)2]“^ < ^ Cij^Cij^P~^ n'Ev^i 

\i, 311^32 

+ Cfj {^V / Pi + / Q]) Cii,jCi2,jQj‘^ 



So, using Assumption (C3) and arguments similar to those used earlier to show that $Var(dT_{WMW}^{(1)} \mid P_i, 
Q_j, 1 \le i \le m, 1 \le j \le n) = S_2(1 + o(1))$ as $d \to \infty$, we get that $Var(\tilde{T}_{WMW}^{(2)} \mid P_i, Q_j, 1 \le i \le m, 1 \le 
j \le n) = o(S_2)$ as $d \to \infty$. Thus, Chebyshev's inequality implies that $\tilde{T}_{WMW}^{(2)}/S_2^{1/2}$ converges to 
zero in probability as $d \to \infty$. 

Next note that 


^WMW 
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(m) 2 (n )2 


E E 

jl^j 2 


{Yn/Pn-W,JQ,,y{V,jPi,-W,JQj,) \ 
+ a^^Q-^Y^alPr^ + / 
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(m) 2 (n); 


EE E 

k=i 11^12 ii^i2 


f(Qjl^ilfc - PiiWjik){Qj2yi2k - -Pi2^j2fc) 


It is easy to see that E{P 


( 1 ) 


WMW 


Pi^QjY E i E rn,l < j < n) = 0. Further, from algebraic 


computations similar to those used earlier in deriving Var{dT^j^^r\Pi, Qj, lE*E'm, lEjE n), 
it can be shown that Var{T^^J^^^\Pi,Qj,l < i < m,l Is j < n) = 82 - Thus, by Theorem 4.0.1 
in Lin and Lu (1996) and Assumption (C4), the conditional distribution of 'PwMw/p 2 ^^ gi’^en the 


Pi's and the Qj's converges to a standard Gaussian distribution as d —)• 00 . This fact along with 

~/ 2 'i 1/2 

(5.14) and the fact that conditionally on the Pj’s and the Qj's, T^mw/ ^2 converges to zero in 


probability as d —)• 00 yield 


lim P{{dT^^j^yy 

a —>^00 


'Si)/Sl^^ <x} = 4>(x). 


(5.15) 








Next, let us write 


r(2) 

'WMW 







'w^ji 


i 2 


'W^j 2 




IXjj Yj 2 I 


- 1 


(m) 2 (n) 


E E 


h¥=i 2 ji¥=j 2 ^ *1 


(4^,7^ + )''"(4p.7" + 




_ "Ji 

|X.n-Y,J 


12 

1^*2 ^i2 I 


'W'^j 2 


- 1 


_ 7^(3) I 7 - 1 ( 4 ) 

“ ^WMW ^WMW^ 


( 5 . 16 ) 


where 


n(3) _ 

-WMW — 


_ I _V V 

(m) 2 (n )2 

^IT'^2 J1TJ2 L 


(Xn-Y,J'(X,,-Y,J- 


^ ^ (4Pr4 + ^wQn444Pi4 + ‘’wQi'‘y^ 
44P.4 + <-Qjy'44P,4 + 


|Xi, -Y,J 


|x,, -Y„| 


- 1 


and 


n( 4 ) _ 

-WMW 


{m) 2 {n )-2 


E E 

* 174*2 ii74i2 


(4P.4 + 4vQ4y'H4P.4 + 


X 


'W^j 2 


dioip^+‘^iQiy4‘’iPi4+ 


|x.. - Y 


n\ 


^2 _ 

1^*2 I 


- 1 


As mentioned earlier, $S_2 \ge [(m)_2 (n)_2]^{-2} \min\{L_3, L_4, L_5\}\, tr[(\Sigma_V + \Sigma_W)^2]$. Moreover, the stationarity 
of the sequences X and Y and the Cauchy-Schwarz inequality imply that $tr[(\Sigma_V + \Sigma_W)^2] \ge 
d(\sigma_V^2 + \sigma_W^2)^2$. These facts along with (5.10) and Assumption (C3) imply that conditionally on 
the $P_i$'s and the $Q_j$'s, each term inside the double sum appearing in $T_{WMW}^{(4)}$ above is $o_P(S_2^{1/2})$ as 
$d \to \infty$. So, $T_{WMW}^{(4)}/S_2^{1/2}$ converges to zero in probability as $d \to \infty$. 

Next, fix any $i_1 \ne i_2$ and $j_1 \ne j_2$ and consider the corresponding term inside the double 
summation appearing in the expression of $T_{WMW}^{(3)}$. It follows from (5.10) that 
$d(\sigma_V^2 P_{i_1}^{-2} + \sigma_W^2 Q_{j_1}^{-2})^{1/2} (\sigma_V^2 P_{i_2}^{-2} + \sigma_W^2 Q_{j_2}^{-2})^{1/2} / [\|X_{i_1} - Y_{j_1}\| \, \|X_{i_2} - Y_{j_2}\|] - 1$ 
converges to zero in probability as $d \to \infty$. Also, note that 


(X,, - Y,J'(X,, - Y,,) - ll/ijp = (X,, - Y,, + /i)'(X7, - Y,, + /z) 

~ 17 (Xjj — Yjj P n) — fj, (Xjj — Yjj -|- /z) 


E 


{QnVhk - PilWj^k){Qj2Vi2k - Pi2^j2k) \ 


PilPpQ jiQ j 


^^4 V - *1-* t2^Jl'^]2 

- yiQjl^h - Ph^jl)p/{PilQjl) - P'{Q32^12 - Pi2^h)l^yPi2Q32) 


/ 


( 5 . 17 ) 


( 5 . 18 ) 
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It is easy to show that the conditional expectation of the first term in (5.18) given the Pi's and 

the Qj's is zero, and its conditional variance is Vi^i^j^j^ = [Pi^Pi 2 ]~'^tr{T,y) + + 

{[PiiQj 2 ]~'^ + [Pi^Qj^]~'^}ii(J^yTjw)- So, = 0(tr[(Sy + Sw)^])- Hence, using the fact that 

S2 > {{TTi)2{n)2\~‘^ min{L3, L4, L5}tr[(S\/ + and Chebyshev’s inequality, it follows that the 

first term in (5.18) after scaling by S 2 is bounded in probability, conditional on the Pi's and the 

Qj's, as d —>■ 00 . Using Assumption (C3), Chebyshev’s inequality and arguments similar to those 

used to prove the convergence in probability to zero of earlier, we get that the second and 

1 /2 

the third terms in (5.18) after scaling by S 2 converge to zero in probability as d —)• 00 . So, the left 

1 /2 

hand side of the equation (5.17) after scaling by S 2 is bounded in probability, conditional on the 

1 /O 

Pi's and the Qj's, as d ^ 00 . Thus, /S 2 converges to zero in probability as d —)■ 00 . This 

along with (5.16) and the fact that converges to zero in probability as d —)• 00 together 

imply that /S 2 converges to zero in probability as d —)• 00 . Combining this fact with (5.15) 

and (5.11), we get lim.d -,00 P {{dTw mw - lll^lP'S’i)/*?^'^^ ^ x\Pi,Qj,l < i < m,l < j < n} = <h(x) 
for all X € M and for each m,n > 1 . Consequently, 

hm PiidTwMW - M\^Si)/Sl/^ <x} = $(x) 

d—)-oo 

for all X € M and for each m,n > 1 . 

( 2 ) ( 2 ) 

We now derive the asymptotic distribution of T^q- As in the proof of Theorem 2.1, T^q = Ti — 

T 2 . In the setup of the present theorem, Ti = [{m)2{n)2]~^ Ylji^j 2 (^ii/Pii-^ji/Qjiy{^i 2 /Pi 2 - 

^ 32 /Qi 2 y andr 2 = 2 {mn)~^ Y.i,j l^'(Yi/j/Qj)- So, E{Ti\Pi,Qj, 1 < z < m, 1 < j < n) = 0 . 

( 2 ) 

Further, from algebraic computations similar to those used to derive the variance of PcQ in the 
proof of Theorem 2.1, it follows that 


Var{Ti\Pi,Qj, I < i < m,l < j < n) = ^ W X] Y^Ph] ^tr(Sy) 

[(m)2(n)2j^ 1 

+2 ^ [Qj.Qj.YhvYl,) + AY.YiQjrMP^vEw) 

jl¥^j2 i,j 

Dehne = Var{Ti\Pi, Qj,l < i < m,l < j < n). Also, E{T 2 \Pi, Qj,l < i < rn,l < j < n) = 0, 
and Uar(T 2 |Pi, Qj, 1 < i < m,l < j < n) = o^S^) as d —>■ 00 using the assumptions in the 
theorem. Thus, T 2 / 53 ' converges in probability to zero as d —>■ 00. Further, using arguments 
similar to those used to prove the asymptotic Gaussianity of above, it follows that the 

conditional distribution of Ti/S^ given the Pi's and the Qj's converges weakly to a standard 
Gaussian distribution as d — >■ 00 for all m, n > 1. Combining these facts, we have 

mn p{(t^2^ - <x} = $(x) 

for all X G M and all m, n > 1. 
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(b) Note that Si is a real valued ^-statistic whose kernel + cryQj^) + 

has finite expectation V’l = E‘^{PQ by the assumption in the 


theorem. Thus, it follows that Si converges almost surely to t/’i. Define ^21 = [(m)2{(n)2}^]~^T3, 
S22 = [{{fn)2Y{n)2]~^L4^ and S2Z = [mn{m — l)^(n - 1)^]“^L5. Each of 52 i, S22 and S22, is a 
real valued E-statistic whose kernel is bounded and thus has finite expectation. So, there ex¬ 
ist V’21) V’22 and 'i/’23 depending only on the distributions of P and Q such that S'21, S22 and 
S21, converge almost surely to 11:21, '4’22 and ^^23, respectively. Here, 'i/’2i = P‘^{QiQ2/[{(^vQ2 + 
f^wPi)i^vQ2 + ' 4’22 = E'^{PiP2/[{al,Qj -h a'^P^){cr‘^Ql + and 1P23 = 

[i>2i'tp22V^'^■ Define -02 = 2 tr(S^)'i/' 2 i/(W ')2 + 2 tr(S^)' 022 /(ra )2 + 4 tr(SySn/)' 023 /("i^)- Recall that 
S2 = 2 tr(Sy)S' 2 i/(m) 2 -|- 2 tr(S^) 522 /(ra )2 + 4 tr(SySvy) 523 /(mn). Conditions (Cl) and (C 2 ) along 
with Theorem 2 . 1.5 in Lin and Lu ( 1996 ) imply that both V and W possess continuous spectral 
densities. Now, the proof of Theorem 18 . 2.1 in Ibragimov and Linnik ( 1971 ) implies that each of 
tr(Sy), tr(S^) and tr(SySn/) equals a constant multiple of d plus a remainder term, which is 
o{d) as d — 7 > 00. Thus, for each hxed m,n> 1 , there exist constants Ai, A2 and A3 such that with 
probability one 


lim ^ {m)2 21P22A2I (n)2 4^23^3/ (mn) 

S2 2S2iAi/{m)2 + 2S'22^2/(ra)2 + 4523^3/ {mn) 


( 5 . 19 ) 


We denote the right hand side of ( 5 . 19 ) by Rm,n- Further, the assumption in the theorem and 
arguments preceding ( 5 . 19 ) imply that ||/i|p/' 02 '^^ converges to a finite non-negative limit 6 ^ (say) 
as d ^ 00. Now, 


lim lim P 

m,n—>-oo d—>-oo 


= lim lim P 

m,n—>-oo d—>-cx) 


I dPwMW - ll/^II^V'l 




< X 


^ xili^^ ||p|p(5i - V’l) 


c,i/2 - jji/2 ^ 1/2 

02 02 ^2 


lim E 

m,n—>-oo 


lim E 

m,Ti—>-cx) 


lim P 

d^oo 


f dPwMW - llhll^_^ ^ xipy^ 


'{Si - -01) 


U/2 


5 . 


1/2 


1/2 


i^r 


d> I lim 21 ^ Jx - {Si- ipi) lim 

d—)-oo I d—^oc 


d^oo ^ 1/2 


s. 


\Pls,Q's 


P'iS,Q'.s 


= E 


lim (^{Rra,n{x - {Si - 'i/’l) 6 ^})jT)'s, Q' s 
m,n—>-oo 


= 4>(x), 


where the last equality above follows since Rm,n converges to one and Si — ipi converges to zero 
almost surely as m, n —>■ 00. 

Note that [(m)2{(n)2}2]-i [{n) 2 {{rn) 2 Y]-^Y.njij 2 ^QhQh]~‘^ and [mn{m- 

l)^(n — 1 )^]”^ appearing in the expression of S3 converge to E'^{P~‘^), E'^{Q~‘^) 

and E{P~'^)E{Q~‘^), respectively, as m,n —>■ 00. Also note that Si = Disp{X.) = 'PyE{P~‘^) 
and S2 = DispiY) = EwE{Q~‘^). So, arguing as in the case of S2 above, we get that S'3/ri 
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converges in probability to one as first d —oo and then m,n ^ cx). Thus, it follows that 
limd^oo P{{Tcq - ll/^lP)/rt^ < x} = $(x) for all x G M. □ 

Proof of Theorem 3.2. Since Y is distributed as X + /x, we have ifi = ay‘^E'^{PQ/{P‘^ + Q'^Y^} 
and '02 = ['Xy-E(-P“^)]“^-E^{QiQ 2 /[(-Pf + Q\Y^‘^{Pl + Here, 0i and 02 are as in 

the proof of Theorem 3.1. Since hmm,n^oohmrf_).oo H/xlP/T^^^ = c for some c € (0,oo), we have 
lim^ .,71^00 limrf^oo / 3 ^( 2 ) (/x) = $(-Ca + c), and 

J r’lO 


lim lim 07 ’„ 
m,n^oo d^oo 


■(h) — —Ca + 


C£;(P- 2 )£; 2 {PQ/(P 2 + q 2 ^ 1 / 2 | 

i?{ClQ 2 /[(Hi' + Qf)H 2 (p 2 + q 2 ) 1 / 2 ]} 


Now, E^{QiQ 2 /[{Pl + QlYP{Pi + QlY'^]} = E[E‘^{Qi/{Pl + Qf)H2|P,}] < £:[T;{Q 2 /(p 2 + 
Ci)l-Hi}] = P{Q\/{PiPQ\)} — 1/2- Here, the inequality can be obtained using Jensen’s inequality. 
Further, E 2 {Pg/(p 2 +Q 2 )l/ 2 | > p- 2 {(p 2 ^g 2 )l/ 2 /pg| > p-l{(p 2 ^g 2 )/p 2 g 2 | ^ l/{p(p- 2 ) + 

E{Q~‘^)} = [ 2 ill(P“ 2 )j-i_ Here, the inequalities follow from Cauchy-Schwarz inequality. Combining 


the previous two inequalities, we get lim^,n-^oo hm^^oo(h) > 1 ™^ 


) hmd^oo/3p2) (/x). 


‘CQ 


□ 


Proof of Theorem 3.3. (a) Let us consider the conditional distribution of Tsr given Pi, P 2 ,..., Pn. 
By definition, 


Tsr = 


1 


(Hjj Vj^ + Vj2 + 2/xPj^Pj2)0Pi4 Vjg + Pi^Vi^ + 2 fj.Pi^PiY 
(XX)4 .... llPjgVjj + PjgVj2 + 2/xPjjPj2|| ||Pj4Vj3 + PjgVj^ + 2/xPj3Pj4|| 

^15^2 i^4 

all distinct 


Consider the event E = {h“^||P 2 Vi + P 1 V 2 + 2 /xPiP 2 |p — (Ty(p2 + P^) = o(d~^P~^Y as d 00 }. 
From Theorem 8.2.2 in Lin and Lu (1996) and the assumptions in the theorem, it follows that for 
any given e G (0,1/2), Pr(P|Pi, P 2 ) = 1 for almost every Pi and P 2 . Let us rewrite Trr as 


where 


Tsr 


1 


E 


+ 


(ra)4 ^ 

^ y 

(n)4 ^ 

* 17^*2 7^*3 7^*4 


(Hi 2 Vxi + Pq Vx 2 . 

d<^UP^,+Pl 


+ ^hPiiPi 2 ) 


PioY{Pi4^i3 + PiY^ii + ‘^TP^Pu) 


P.2)l/2('p2 

12 ' ' *3 



(HjjVjg + PifVi.^ + 2/xPqPj2)0Pi4 Vjg + Pi^Wi^ 

. d^iipi+piy'^ipi+piy^ 


Pia'^H + ‘^h-PiaPii) 


|Ci 2 + Pj^ Vj 2 


daUPl+P^Y^Hp^ + PlY^^ _ 

PiY^ia + Pia^u p “^h-PiaPiY 


+ 2 /xPjgPj 21 


(T™ + r®)/d, 


1 


p(i) ^ _L V 

(n)4 . .^ 


HijVj 2 + 2/xPqPj2)0Pi4 Vjg + Pi^'Vi^ + 2^Pi^PiY) 


(Hi2 Vii + Pii V 22 ^hX'uX'i2l l-f'M V »3 2 

+y.)'/^(Cyy 4 


(5.20) 
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and 


7^(2) _ 

^SR~ 


{n)4 


E 




{Pi2^il + Pii^i2 + ‘^l^PiiPi2)'{Pii^h + Ph^ii + ^t^PisPu) 


daliPl + Piy^Pl + f>2jl/2 


I|-Pi4^i3 + Piz^iA + ^/iPjgPj 


- 1 


*4 I 


Recall that a similar decomposition of $T_{SR}$ was obtained in the proof of Theorem 2.3. The proof of 
the asymptotic Gaussianity of $T_{SR}$ follows from the ideas used to prove the asymptotic Gaussianity 
of $T_{WMW}$ in Theorem 3.1, and the details are provided in Appendix II. $Z_2$ and $Z_3$ appearing 
in the asymptotic Ganssian distribntion of Tsr are given by Z 2 = 2 [(n) 4 C 7 y]“^ 
and Z 3 = 8 tr(S^)[(n) 4 CT ^]“2 where Pi,PiJ[{Pl + Pl)^^^iPl + 

Pirn- 

The proof of the asymptotic Gaussianity of $T_S$ will follow from arguments similar to those used 
to prove the asymptotic Gaussianity of $T_{SR}$, and we skip the details. $Z_1$ and $\Gamma_3$ in the asymptotic 
distribution of $T_S$ are given by $Z_1 = [(n)_2 \sigma_V^2]^{-1} \sum_{i_1 \ne i_2} P_{i_1} P_{i_2}$ and $\Gamma_3 = 2\,tr(\Sigma_V^2)/[(n)_2 \sigma_V^4]$. 

The proof of the asymptotic Gaussianity of T^q is also provided in Appendix - II, and Z 4 
appearing in its asymptotic distribution is given by Z 4 = 2 tr(Sy)[(n) 2 ]“^ [-^* 1 -^* 2 ] 

(b) Observe that Zi, Z 2 , {n) 4 Z ‘4 and (n) 2^4 are real-valued R-statistics, whose kernels have finite 
expectations by the assumption in part (b) of the theorem. So, they converge almost surely as 
n —> 00 . The corresponding limits are 9i = E‘^{Pi)/ay, 62 = E‘^{PiP 2 /{Pf + P^Y^}/oy, 6*3 = 
ir{T?y)E‘^{P 2 P^/{Pl + Pi)P‘^{Pl + PiYP}/a\r and 64 = 2 tr(E^)£; 2 (p- 2 ), Note that since E{p-^) 
is hnite, we have S = Disp(X.) = EyE{P~‘^) and cr^ = Var{Xi) = ayE{P~‘^). So, 64 = 2tr(S^). 
Arguments similar to those used in the proof of part (b) of Theorem 3.1 complete the proof of part 

(b) of the present theorem. 

(c) Suppose that lim^^oo limrf_^oo = c for some c G (0, 00 ). Then, 


lim hm ^TsilA = H-Ca + cE^{Pi)E{P^^)), 

n —>-00 d^QO 


lim lim I3tsr{t) = ^(“Ca + 


n —>-00 d -^00 


cE‘^{P4P2/{PI + PiYP}E{P^^) 

E{P 2 P^/[{Pl + P |) V2 ( p2 + ^ 32 ) 1 / 2 ]}^’ 


lim lim /3 (p (/x) = d>(-Ca + c). 
n^oo d^oo ^CQ 


Now, from Jensen's inequality, we have $E^2(P_1) \ge E^{-2}(P_1^{-1}) \ge E^{-1}(P_1^{-2})$, which implies that 
$E^2(P_1) E(P_1^{-2}) \ge 1$. Thus, $\lim_{n \to \infty} \lim_{d \to \infty} \beta_{T_S}(\mu) \ge \lim_{n \to \infty} \lim_{d \to \infty} \beta_{T_{CQ}^{(1)}}(\mu)$. The proof of the 
other part of the theorem is similar to the proof of Theorem 3.2. □ 
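
The factor $E^2(P_1) E(P_1^{-2})$ that multiplies $c$ in the limiting power of the test based on $T_S$ can be approximated by Monte Carlo for any scaling distribution. The sketch below is illustrative only; it assumes, for the spherical t(5) example, that $P = (\chi^2_5/5)^{1/2}$, which is the usual scale mixture representation and is not taken from the text above. The function name is hypothetical.

```python
import numpy as np


def sign_test_gain(p):
    """Empirical version of E^2(P) * E(P^-2) from positive draws of P."""
    p = np.asarray(p, dtype=float)
    return np.mean(p) ** 2 * np.mean(p ** -2.0)


rng = np.random.default_rng(0)
# Assumed scaling variable for a spherical t distribution with 5 degrees of freedom.
p_t5 = np.sqrt(rng.chisquare(5, size=100_000) / 5.0)
print(sign_test_gain(p_t5))          # exceeds one for a non-degenerate P
print(sign_test_gain(np.ones(10)))   # equals one when P is constant
```

The empirical factor is never below one, since Jensen's inequality applies to the empirical distribution of the draws as well, and it equals one only when $P$ is degenerate.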
Appendix — II 

Additional mathematical details related to the proof of part (a) of Theorem 3.3 


Here, we provide more details related to the derivations of the asymptotic distributions of Tsr, Tr 
and T^q under the assumptions of Theorem 3.3. Recall that Tsr = + where and 

are dehned in the proof of Theorem 3.3 in Appendix -1. Define + 

Hjg)(T’jg + Then, by the definition of we have 


T^(l) 
^ SR 


4 

{n)^al 




It follows easily that E(T^^l\Pi,l < i < n) = 4||/r|p[(n)4cry] Set Z 2 = 

2[(n)4cry]“^ 174^^*2^11 Ti 2 - Further, it can be shown using the assumptions in the theorem 

that Var{T^gl\Pi, I < i < n) = 32tr(S^)[(n)4cr^]“2 + ^(l)) as d 00 . Let Z 3 = 

8tr(S^)[(n)4a^]"2Eii^i2 Pfi *2 • using arguments similar to those used to prove the asymp¬ 
totic Gaussianity of in the proof of Theorem 3.1, it follows that — 2\\^\\^Z 2 )/{2Zy^) 

converges weakly to a standard Gaussian distribution as d —>• 00 for each re > 1. Moreover, using 

( 2 ) 

arguments similar to those used to prove the convergence in probability of T^'j^fiy in the proof of 
Theorem 3.1, it follows that Tg^jZ^ converges to zero in probability as d —>• 00 for each re > 1. 
This fact along with the equation Trr = {T^r + P^R)/d and the asymptotic Gaussianity of 
yields 


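\[
\lim_{d \to \infty} P\{(dT_{SR} - 2\|\mu\|^2 Z_2)/(2 Z_3^{1/2}) \le x\} = \Phi(x)
\]

for all $x \in \mathbb{R}$ and each n > 1. Here, $\Phi$ is the standard Gaussian cumulative distribution function. 
Using very similar arguments as above, we get that 

\[
\lim_{d \to \infty} P\{(dT_S - \|\mu\|^2 Z_1)/\Gamma_3^{1/2} \le x\} = \Phi(x)
\]

for all $x \in \mathbb{R}$ and each n > 1, where $Z_1 = [(n)_2 \sigma_V^2]^{-1} \sum_{i_1 \ne i_2} P_{i_1} P_{i_2}$ and $\Gamma_3 = 2\,tr(\Sigma_V^2)/[(n)_2 \sigma_V^4]$. 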

Next, consider the conditional distribution of t'qq given the Pj’s, and note that 


rpW 

^CQ 


1 (Vq + l^Piiy(^i2 + IJ'Ph) 
(n )2 PnPi, 

2 - + 


(re) 


* 17^*2 


re 


Pi 


So, £'(^^1^4,1 < i < n) = ||//|p, and Var{T^Q\Pi,l < i < n) = £ 4(1 -|-o(l)) as d —)> 00 , 

where Z 4 = 2 tr(Sy)[(re) 2 ]~^ Using the assumptions in the theorem, it follows 

2 /2 

that conditional on the Pj’s, Yli Pi — op{Z^ ) as d —)■ 00 . Thus, we get 

hm - ||/i|p)/zE < 4 = 


for all X G M and each re > 1. 
Appendix — III 

Detailed results of the simulation study done in Section 4 

Here, we present the sizes and the powers of the tests based on $T_{SKK}$ (Srivastava et al., 2013) and $T_{GCBL}$ (Gregory et al., 2014) discussed in Section 4. We also present the sizes and the powers of the test in Cai et al. (2014), whose test statistic is denoted by $T_{CLX}$. Table 2 reports the sizes of these tests, implemented using the asymptotic approximations given in their original papers, under the models considered in subsections 2.1 and 3.1 of our paper. We also report the sizes of the tests implemented using the permutation distributions of these test statistics.

Note that we could not implement the test based on $T_{CLX}$ using its permutation distribution because the test procedure uses a computationally intensive optimization. For the same reason, we could not implement this test for $d = 1600$ under any of the above models using the asymptotic distribution given in Cai et al. (2014). Recall that we have discussed the sizes of the tests based on $T_{WMW}$ and $T_{CQ}^{(2)}$ for the above models in detail in subsections 2.1 and 3.1.
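For readers who want to reproduce the permutation-based sizes, the general recipe can be sketched as follows. This is only an illustration under stated assumptions: `mean_stat` is a generic Chen-Qin-type two-sample statistic used as a stand-in (it is not the $T_{SKK}$, $T_{GCBL}$ or $T_{CLX}$ statistic), and the function names, the number of permutations `n_perm` and the number of replications `n_rep` are hypothetical choices rather than the settings used in the paper.

```python
import numpy as np

def mean_stat(X, Y):
    """Chen-Qin-type statistic (stand-in): estimates ||mean(X) - mean(Y)||^2."""
    def within(Z):
        G = Z @ Z.T
        m = Z.shape[0]
        return (G.sum() - np.trace(G)) / (m * (m - 1))  # average over pairs i != j
    return within(X) + within(Y) - 2.0 * (X @ Y.T).mean()

def permutation_pvalue(X, Y, stat, n_perm=200, rng=None):
    """P-value of stat(X, Y) under random relabelling of the pooled sample."""
    rng = np.random.default_rng() if rng is None else rng
    pooled = np.vstack([X, Y])
    m = X.shape[0]
    observed = stat(X, Y)
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(pooled.shape[0])
        if stat(pooled[idx[:m]], pooled[idx[m:]]) >= observed:
            count += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return (count + 1) / (n_perm + 1)

def empirical_size(sampler, stat, level=0.05, n_rep=100, n_perm=200, seed=0):
    """Fraction of null replications in which the permutation test rejects."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_rep):
        X, Y = sampler(rng)  # sampler must draw both samples under the null
        if permutation_pvalue(X, Y, stat, n_perm, rng) <= level:
            rejections += 1
    return rejections / n_rep
```

A call such as `empirical_size(lambda r: (r.standard_normal((20, 100)), r.standard_normal((20, 100))), mean_stat)` returns the proportion of rejections under a Gaussian null. Each replication evaluates the statistic `n_perm` extra times, which is why a statistic requiring an expensive optimization is hard to calibrate this way.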





Figure 3: Powers of the tests at nominal 5% level based on $T_{WMW}$ (- + - curves), $T_{CQ}^{(2)}$ (- o - curves), $T_{SKK}$ (- x - curves) and $T_{GCBL}$ (- Δ - curves) for the AR(1) model with Gaussian innovation (left panel), the AR(1) model with $t(5)$ innovation (middle panel) and the spherical $t(5)$ distribution (right panel).

In Figure 3, we give the plots of the empirical powers of the tests based on $T_{SKK}$ and $T_{GCBL}$ when they are implemented using their permutation distributions. Each plot in Figure 3 also includes the empirical powers of the tests based on $T_{WMW}$ and $T_{CQ}^{(2)}$. The power curves for these two tests are so close that they overlap in the left and the middle plots. Similarly, the power curves corresponding to the tests based on $T_{SKK}$ and $T_{GCBL}$ overlap in all the plots.

Table 2: Sizes of the tests based on $T_{SKK}$, $T_{GCBL}$ and $T_{CLX}$ under some simulated models

Test                      d       AR(1) with             AR(1) with          spherical t(5)
                                  Gaussian innovation    t(5) innovation     distribution

T_SKK-original            100     0.06                   0.064               0.011
                          200     0.068                  0.06                0.001
                          400     0.071                  0.072               0
                          800     0.089                  0.089               0
                          1600    0.101                  0.089               0

T_SKK-permutation         100     0.045                  0.039               0.043
                          200     0.047                  0.048               0.044
                          400     0.054                  0.043               0.039
                          800     0.048                  0.052               0.049
                          1600    0.042                  0.054               0.051

T_GCBL-original           100     0.077                  0.071               0.137
                          200     0.075                  0.078               0.148
                          400     0.086                  0.081               0.141
                          800     0.125                  0.134               0.152
                          1600    0.164                  0.152               0.185

T_GCBL-permutation        100     0.042                  0.048               0.046
                          200     0.051                  0.042               0.044
                          400     0.05                   0.056               0.038
                          800     0.046                  0.047               0.042
                          1600    0.055                  0.047               0.039

T_CLX-original            100     0.082                  0.075               0.076
                          200     0.101                  0.114               0.093
                          400     0.136                  0.147               0.105
                          800     0.167                  0.184               0.131

Figure 4 gives the plots of the empirical powers of the tests based on $T_{WMW}$, $T_{CQ}^{(2)}$ and $T_{CLX}$ when the mean shifts in the models considered in subsections 2.1 and 3.1 are distributed equally among all the coordinates. Once again, the powers of the tests based on $T_{WMW}$ and $T_{CQ}^{(2)}$ are sufficiently close that their curves overlap in the left and the middle plots.

In Figure 5, we give the sizes and the powers of the tests based on $T_{WMW}$ and $T_{CQ}^{(2)}$ for the multivariate Gaussian distribution with dispersion matrix $(1 - \beta) I_d + \beta 1_d 1_d^{\top}$ with $\beta = 0.7$ considered in Section 4.
Figure 4: Powers of the tests at nominal 5% level based on $T_{WMW}$ (- + - curves), $T_{CQ}^{(2)}$ (- o - curves) and $T_{CLX}$ (- x - curves) for the AR(1) model with Gaussian innovation (left panel), the AR(1) model with $t(5)$ innovation (middle panel) and the spherical $t(5)$ distribution (right panel).




Figure 5: Powers of the tests at nominal 5% level based on $T_{WMW}$ (- + - curves) and $T_{CQ}^{(2)}$ (- o - curves) for the multivariate Gaussian distribution with dispersion matrix $(1 - \beta) I_d + \beta 1_d 1_d^{\top}$ with $\beta = 0.7$.
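The equicorrelated Gaussian model behind Figure 5 is easy to simulate. The sketch below is one possible way to do it, assuming a zero mean vector and using a one-factor representation (a shared Gaussian component plus independent noise) instead of a Cholesky factor. The function names and the sample sizes are hypothetical and are not taken from the paper.

```python
import numpy as np

def equicorrelated_cov(d, beta):
    """Dispersion matrix (1 - beta) * I_d + beta * 1_d 1_d^T."""
    return (1.0 - beta) * np.eye(d) + beta * np.ones((d, d))

def sample_equicorrelated(n, d, beta, rng):
    """Draw n rows from N(0, (1 - beta) I_d + beta 1 1^T), for 0 <= beta <= 1.

    Each row is sqrt(beta) * (one shared N(0,1) draw) plus
    sqrt(1 - beta) * (d independent N(0,1) draws), which has the stated covariance.
    """
    common = rng.standard_normal((n, 1))
    noise = rng.standard_normal((n, d))
    return np.sqrt(beta) * common + np.sqrt(1.0 - beta) * noise

rng = np.random.default_rng(0)
X = sample_equicorrelated(n=20, d=200, beta=0.7, rng=rng)

# The largest eigenvalue of the dispersion matrix equals 1 + (d - 1) * beta.
top = np.linalg.eigvalsh(equicorrelated_cov(200, 0.7)).max()
print(X.shape, top)
```

The largest eigenvalue grows linearly in $d$, so the dependence among the coordinates does not weaken with their separation. This is the kind of strong dependence for which this model serves as an example.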